NON-SCHOOL ENGLISH OF HIGH-SCHOOL STUDENTS


E. J. ASHBAUGH 
Ohio State University

What is the real measure of our teaching—thinking, development,
ability to adjust, attitudes, habits, or skills? Whatever we
may assume to be the function of the educational process, we must 
evaluate our teaching through some measure of reaction. 

Is the real measure shown by what the child does in the class- 
room for the teacher? This is probably the best condition under 
which to ascertain to the fullest extent the knowledge he has ac- 
quired in any particular subject and the skill which he can show. 
Is the best measure what the child does on a standardized test given 
by the principal, supervisor, or other school official? This gives 
us data by which to measure the efficiency or ability of our children 
as compared with children of other schools tested in like manner. 
Is it the knowledge he can reveal when he is quizzed at home?
This is the measure upon which the schools are frequently criti- 
cized or commended. 

It would seem, however, that none of these is the best measure 
of the efficiency of our teaching. What the child does when he is
“on his own”; the habitual reactions when he is thinking about
something else; the standard he considers sufficient when he knows 
he is to be judged only by his peers—these are the crucial measures 
since they represent the fruit of the teaching effort, borne when 
teacher and school are no longer among the stimulating factors. 

This paper presents in a preliminary way the results of an 
effort to ascertain this measure of our teaching efficiency in the 
field of written English. Specifically, what is the level of English 
usage which children reveal when the school and the teacher are 
out, when the thought is uppermost, when the recipient is to be 
another child? 

It was conceived that the one form of written English which
would fulfill these conditions was the letters which children wrote 
to their friends and sent through the mails. The problem was limited
to children of the junior and senior high-school grades and the
letters were collected through English teachers in a large number 
of cities. 

The teachers were requested to ask their pupils to turn in, for 
examination, letters which they had received from their friends 
who were in either junior or senior high schools at the time the 
letters were written. The children were assured that the letters 
would not be published nor any use made of them which could 
possibly embarrass either the writer or the recipient. The purpose 
of the study was fully explained to be an effort to ascertain the English
usage, in their personal correspondence, of young people of


TABLE I

WORD AND SENTENCE LENGTH OF LETTERS WRITTEN
BY HIGH-SCHOOL PUPILS

                                             Grade   Grade   Grade
                                              VII     IX      XII
Total number of letters                       100     100     100
Average number of words per letter            212     273     278
Average number of words per sentence           10      11      12
Average number of words per paragraph unit     24      31      38
Average number of paragraphs per letter       8.8     8.8     7.3





these ages and school grades. Each letter was marked with the 
grade, sex, and age of the writer. 

Something more than fifteen hundred letters have been received, 
but the material herewith presented is taken from one hundred 
letters from each of Grades VII, IX, and XII. The letters in 
each grade were taken from the age-group containing the largest 
number and represent mainly children of thirteen, fifteen, and 
seventeen years of age. The three hundred letters were written by 
52 boys and 248 girls—most of them by girls to girls. The content 
is mostly about school, parties, vacations, and family affairs. In 
the seventh grade, very few of the letters seem to have been writ- 
ten because of any real joy in writing. Rather it seemed to be an 
unpleasant duty. This situation is less true in the ninth grade. 
Usually in the twelfth grade, the writer has a bit of news which 
she really wishes to communicate to her friend. 
Table I gives a clearer idea of the letters. The average number
of words per letter increases somewhat as we go up in the grades,
though the difference between the ninth and twelfth is not significant,
and the sentences become slightly longer. The number of
words in the paragraph units increases, in spite of the fact that the
number of paragraphs per letter is the same in the seventh and
ninth grades and is actually fewer in the twelfth. These paragraph
units are in terms of the paragraph divisions that should
have appeared in the letter on the basis of the content and not
simply on the basis of the paragraphing which the pupils did.
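
The relation among the rows of Table I may be checked by a short calculation. The sketch below, in Python, assumes that the average length of a paragraph unit is simply the average number of words per letter divided by the average number of paragraph units per letter; the names used are illustrative and form no part of the original study.

```python
# Averages reported in Table I (100 letters in each grade).
table_i = {
    "VII": {"words_per_letter": 212, "paragraphs_per_letter": 8.8},
    "IX": {"words_per_letter": 273, "paragraphs_per_letter": 8.8},
    "XII": {"words_per_letter": 278, "paragraphs_per_letter": 7.3},
}


def words_per_paragraph_unit(words_per_letter, paragraphs_per_letter):
    """Assumed relation: words per unit = words per letter / units per letter."""
    return words_per_letter / paragraphs_per_letter


# Round to whole words, as the table does.
paragraph_lengths = {
    grade: round(words_per_paragraph_unit(**row))
    for grade, row in table_i.items()
}
print(paragraph_lengths)  # {'VII': 24, 'IX': 31, 'XII': 38}
```

Under that assumption the quotients reproduce the tabled values of 24, 31, and 38 words, which shows how the paragraph units can lengthen while the number of units per letter stays level or falls.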

Table II shows that paragraphing is not one of the strong 
points in English form for children of high-school age. No indenta- 
tion or other indication of paragraphing appeared to set off 35 
percent of the paragraph units in the letters of the seventh grade. 
There was 22 percent of omission in the ninth and 16 percent in 


TABLE II

PERCENT OF ERRORS IN PARAGRAPHING AND
SENTENCE STRUCTURE

                                  Grade   Grade   Grade
                                   VII     IX      XII
Paragraph errors                    35      22      16
Poor sentence structure            6.1       6     5.8
Grammatical errors per letter      1.2 [remaining figures illegible]





the twelfth. Evidently some progress is made in teaching children 
to paragraph units of thought although the twelfth-grade letters 
are still far from perfection. 

Poor sentence structure, in which are included run-on sentences,
double negatives, meaning not clear, and so forth, is a rather constant
quantity, being found in, roughly, 6 percent of all the sentences
in each of the grades. Grammatical errors, which included
the use of wrong mode, tense, case, and number, and lack of agree- 
ment between subject and verb, occurred on the average of one 
time per letter. The incorrect use of tenses, which includes both 
failure to use proper tense as compared with the rest of the letter 
and the wrong form of principal part with the auxiliary or the 
perfect participle without an auxiliary, accounted for 45 percent 
of all of the grammatical errors noted. The lack of agreement of 
subject and verb accounted for 24 percent of the grammatical errors.
Since this measures language usage when the attention of
the child is upon thought expression, it is extremely doubtful that
these grammatical errors are indicative in any sense of a lack of
knowledge of formal grammar or that more teaching of formal
grammar would result in fewer errors of this type.

TABLE III

PERCENT OF ERRORS IN TERMINAL PUNCTUATION

                              Grade   Grade   Grade
                               VII     IX      XII
Period errors:
  Declarative sentences         23      15      16
  Abbreviations                 19      16      19
Interrogation marks             34      30      26


TABLE IV

ERRORS IN THE USE OF THE COMMA

                                          Grade   Grade   Grade
                                           VII     IX      XII
Heading, address, etc.:
  Number needed                            347     354     328
  Percent of errors                         58      51      45
Series:
  Number needed                            185     145     143
  Percent of errors                         22      19      11
Parenthetical words, phrases, clauses:
  Number needed                            162     239     245
  Percent of errors                         87     68 (?)   71


Table III presents data on errors in use of the period and
interrogation mark. There are three possibilities with respect to
these marks: used where they should be, omitted where they should
be used, and used where they are not needed. The figures
report on omission only. Twenty-three percent of the declarative
sentences used in the seventh-grade letters did not close with a
period. The percents for the ninth and twelfth grades are 15 and
16, respectively. The figures for the omission of the period with
abbreviations and omission of the interrogation mark are read in
the same manner. It should be noted that the figures for the
twelfth grade are slightly higher for both uses of the period than 
for the ninth. Whether this represents an actual situation which 
will be found when the larger study is completed, remains to be seen. 

The other mark of punctuation which received careful analysis 
was the comma. Three uses of the comma were considered, using 
only those about which there is practically no disagreement among 
English authorities. Table IV presents these data, giving not only 
the percent of error of omission but also the frequency with which
these marks were needed in the one hundred letters for each grade. 
As will be seen from the table, there were 347 commas needed in the 
heading, address, and complimentary closing in the seventh-grade 
letters, and 58 percent of these commas were absent. The figures


TABLE V

PERCENT OF ERRORS IN CAPITALIZATION, USE OF
APOSTROPHE, AND SPELLING

                        Grade   Grade   Grade
                         VII     IX      XII
Capitalization           7.7     4.8     4.9
Apostrophe:
  Contractions            18      14      17
  Possessives             63      50   [illegible]
Spelling errors          2.6     1.7     1.4





for the ninth and twelfth grades are read in the same manner. 
Likewise, the use of the comma in a series shows the need and 
the percent of cases in which the comma did not appear. A third
use, parenthetical words, phrases, and clauses, which includes words
in apposition, and exclamatory and explanatory phrases and
clauses, is shown in the same manner. The high percent of error
in such instances as the heading and address might be explained
on the basis of carelessness. The highest percent, however, is in 
these parenthetical expressions and probably results not only from 
carelessness but from a lack of knowledge. 

Table V presents further errors in form. In 7.7 percent of the
occasions in seventh-grade letters when capital letters should have 
been used they were omitted. The percents for ninth and twelfth 
grades are 4.8 and 4.9. A large number of capitals were used when 
they should not have been. These average in the seventh grade 
about one wrong use for every fourteen right; in the ninth grade, 
one wrong to nine right; and in the twelfth grade, the same ratio
exists as in the seventh. The apostrophe has two uses, to indicate 
a contraction and to show the possessive. The omission of the 
apostrophe in these two types of usage is also shown in Table V.
Note the large percent of error in the use of the apostrophe to 
show possessives. On the average for all letters the children were 
as likely to omit the apostrophe as to use it. 

The spelling errors in these letters were carefully checked on 
a strict dictionary base. The percent of total number of words 
used which were misspelled is shown in Table V. Note that the 
percent of errors decreases from the seventh to twelfth grades, from 
2.6 percent to 1.4 percent. We have set such a high standard of 
accuracy for spelling in correspondence that any misspelling at all
is irritating. Whether we have a right to demand greater accuracy 
in the seventh grade than 97.4 percent and in the twelfth grade 
than 98.6 percent may be open to discussion. Of course, expressed 
in another way, this represents an average of five and one-half 
words misspelled in each seventh-grade letter; four and one-half
words, in each ninth-grade letter; and four words, in each twelfth-grade
letter. Unquestionably, we are disturbed when we receive
a letter which contains four or five misspelled words. The num- 
ber of misspelled words, however, is no more disturbing than the 
words which are misspelled. Of 403 different words misspelled by 
the ninth grade, 28 were proper names. Ruling these out of con- 
sideration, of the 375 remaining, 148, or 39 percent, are among 
the first thousand words most frequently used in correspondence.
If we consider only the 42 words of this group which were used 
two or more times in each letter in which they were found mis- 
spelled less frequently than used, 29 are in the first thousand and 
39 in the first five thousand. A cursory examination of the words 
misspelled in the twelfth-grade letters shows a large number of 
one-syllable words, with “I’m,” “I’ve,” “I’ll,” and so forth, written
without the apostrophe. “Too,” “their,” and others of their kind
appear also. Very few difficult words appear, though “visualize,”
“immediately,” “chivalrous,” “chautauqua,” and “stationery”
do occur occasionally.
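
The conversion from percent of running words misspelled to misspelled words per letter may be sketched as follows. The Python below takes the average letter lengths from Table I and the spelling percents as reported for the three grades; the function and variable names are illustrative assumptions and not part of the study.

```python
# Average words per letter (Table I) and percent of running words
# misspelled, as reported for each grade.
words_per_letter = {"VII": 212, "IX": 273, "XII": 278}
percent_misspelled = {"VII": 2.6, "IX": 1.7, "XII": 1.4}


def misspellings_per_letter(words, percent):
    """Expected misspelled words in a letter of average length."""
    return words * percent / 100


per_letter = {
    grade: round(
        misspellings_per_letter(words_per_letter[grade], percent_misspelled[grade]), 1
    )
    for grade in words_per_letter
}

# Spelling accuracy is the complement of the error percent.
accuracy = {grade: round(100 - pct, 1) for grade, pct in percent_misspelled.items()}

print(per_letter)  # {'VII': 5.5, 'IX': 4.6, 'XII': 3.9}
print(accuracy)    # {'VII': 97.4, 'IX': 98.3, 'XII': 98.6}
```

The computed values agree, to within about a tenth of a word, with the round figures of five and one-half, four and one-half, and four words quoted in the text.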


SUMMARY 


Whether the complete analysis of the more than two thousand
letters would show the same general condition as these three hundred
remains to be seen. Whether this condition is good or bad
can hardly be determined, since only subjective standards have
been used and these standards have tended to be perfection. A
functional standard to determine the amount of error or the type
of error which may exist without affecting the purpose of composition
has not yet been determined. We do not even know the extent
to which such errors as have here been shown can be reduced by
greater emphasis upon the formal and technical side of English
writing, nor do we know that such reduction would be worth the
time expended. It is highly probable that these results are due
much less to lack of knowledge than to lack of a high personal
standard of usage on the part of the children. In summary, the
English used by children of Grades VII, IX, and XII in their
letters to their friends may be characterized as follows:

1. Most of the letters are written by girls to girls.

2. Most seventh-grade letters seem to have been written from a
sense of duty while most twelfth-grade letters had a message
which the writer wished to convey.

3. The length of letter, whether measured in terms of number of
words, length of sentence, or length of paragraph, increased
from the seventh to the twelfth grade.

4. Paragraphing markedly improves from the seventh to the
twelfth grades, although there is still a high percent of error
in the latter grade.

5. Periods are omitted both after abbreviations and declarative
sentences at the rate of one to every five or six places in which
they are needed.

6. The interrogation mark is omitted even more frequently than
the period.

7. Commas are usually omitted in headings, addresses, after
complimentary closings, and in parenthetical expressions.

8. The percent of running words (not the different words) misspelled
was 2.6 in Grade VII, 1.7 in Grade IX, and 1.4 in
Grade XII. This sounds very small, but when turned into
the number of misspelled words per letter it averages five and
one-half words in each seventh-grade letter, four and one-half in
each ninth-grade letter, and four words in each twelfth-grade
letter. The majority of the misspelled words are found in the
five thousand words most frequently used in correspondence.











CAN OBSERVATION BE TRAINED IN SCHOOL CHILDREN?
W. H. WINCH
Inspector to the London County Council, London, England


(Continued from April)
III. A SECOND EXPERIMENT: SCHOOL Y BOYS


General plan of the experiment.—The second experiment to
be described was also carried out on the same general plan in the
spring and summer of 1910. A whole class of boys (Standards
VI and VII), under one teacher, was divided into two equal and
parallel groups on the basis of the work done in three preliminary
tests in observation. Then one group worked six practice exercises
in arithmetic, whilst the other had six practice exercises in observation.
The two groups again worked together in three final tests
in observation. Some important differences, in the standard, age,
and sex of the children and in the conduct of the practice exercises,
will be noted as the description of the experiment proceeds.

The children who did the work.—This experiment was carried
out with the first class of a medium-sized boys’ school, in the inner
ring of the southeastern suburbs of London. The school was the
one really poor school in a fairly good district and was not in a high
condition pedagogically. But the first class, consisting of Standards
VI and VII, was a good one and was taught by a very able
man who had had some experience in the methods of experimental
pedagogy. In considering this work in relation to that of the girls
previously described, it must be remembered that it was done by
children who were further advanced in their standards, were a
year or more older, and was that of boys, who are not naturally
so proficient as girls in observational work. But they were
instructed and corrected in the course of both sets of their practice
exercises, arithmetical and observational, which the girls had not
been. They were shown the errors in their arithmetical and in their
observational work, both collectively and individually. There was another
important difference also: they were, in their tests, allowed a fixed
time for the writing up of their observations, namely, forty-five
minutes. They were told before every fresh test and exercise the
exact marks they had scored in the previous test, the marks in
arithmetic being so arranged that they were numerically of the
same apparent value as those obtained in the practice tests in
observation. First-rate work carried approximately the same marks
in both. In all other school work during the period of the experiment—some
four months—the two groups worked according to the
same time-table and, as I have said, under the same teacher. One
other point of difference may be mentioned. Two of the practice
exercises in observation were based on actual constructions made
by the teacher, to wit, a capital model of a farmyard with the appropriate
buildings and stock, and a model of a section of a railway
line with rails, signals, engines, carriages, and station.

Chronology of the experiment.—All the work was done in the 
afternoons, at 2:15. In the practice exercises, ten minutes, as in 
the previous experiment, were allowed for observation. In all the 
tests, 45 minutes were allowed, and no more, for writing up the 
observations. In the practice exercises in arithmetic and observation,
a very few boys, who said they could do some more, were allowed
to go on up to a limit of 65 minutes. The special arithmetical
work covered exactly the same time as was allowed for the observa- 
tions themselves plus the written account of them. After the reg- 
isters had been marked at two o’clock and all papers had been 
prepared, a few words were spelled from memory until the exact 
moment for beginning the tests and exercises. 


I. Preliminary Tests (Groups A and B):
   1) A section of a neighbouring high road ............ Thursday, Feb. 24
   2) A local municipal bath ........................... Friday, Feb. 25
   3) [illegible] ...................................... Thursday, Mar. 4

II. Practice Exercises (for Group A only):
   1) [illegible] ...................................... Thursday, Mar. 11
   2) A picture of the First Crusade ................... Thursday, Mar. 18
   3) A model of a section of a railway line ........... Thursday, Apr. 7
   4) A piece of chemical apparatus in action .......... Friday, Apr. 22
   5) The teachers’ room in the girls’ school .......... Thursday, Apr. 28
   6) Pieces of dissociated physics apparatus .......... Thursday, May 5

III. Final Tests (Groups A and B):
   1) A motor omnibus (then much rarer than now) ....... Thursday, May 12
   2) A section of a busy street in the immediate
      neighbourhood of the school ...................... Thursday, May 19
   3) A local shop for stationery, sweets, and toys .... Thursday, June 2


Method of marking the papers.—The boys’ papers were marked 
precisely as described in the preceding experiment. One or two ex- 
tracts from them may be of service. The first extract is from one 
of the two weakest papers in the particular exercise. 

Tom B—, in his third practice exercise, wrote: 

“By looking at the front and left side of the station on may see to holes
and by looking through these one may see a small bell.

In front of the station are some round train lines. On these lines is standing
a train which goes by clockwork.

Attached to the train is a coal-truck and in it is some tin painted black.

On the truck is painted a yellow ‘Great Northern Railway’* and on the
train is painted in yellow a number, 770.

Behind the truck is a carriage with four windows on each side. Standing
by the lines are three soldiers two with their guns down and the third who is
standing in the middle with his gun in a position of fireing,” and so on for
rather more than a page of foolscap.


Tom scores a mark for “station,” one for “left,” and one for
“front,” and marks for “holes,” “two,” “bell,” “small,”
“through these,” “in front,” “train-lines,” “round,” “train,”
“on these lines,” “goes by clockwork,” “coal-truck,” “attached
to the train,” “in it,” “tin,” “painted black,” “on the truck,”
“painted yellow,”* “number,” “yellow,” “770,” “on the train,”
“carriage,” “windows,” “four,” “on each side,” “behind the
truck,” “soldiers,” “three,” “standing,” “by the lines,” “two,”
“with the guns down,” “the third,” “in the middle,” “gun,”
“in a position of firing.”

Fred T—, in his last final test, belonging to the non-practised 
group, but unpaired, wrote in a paper much below the average: 


Blanks have a window on each side of the door. On the right handside
window there are postcards and birthday greeting cards at the price of one
penny each. There are rows of six hung up. Over each row there is a ticket,
on the ticket is painted in blue one penny each. In the window there are
games of all sorts such as Ludo, Snakes and Ladders, Draughts. There are
story books for children also toys. Some of the toys are engines, motor
omnibuses, taxicabs, cabs, a piece of tin with a man with a stick hitting a
woman. . . .

Then Fred, after some more about this window, deals with the 
boards exhibiting placards outside the shop and passes on to the 
other or sweet-stuff window; then goes inside and describes the 
counter with the boxes and bottles on it and behind it. He writes 
for one page and a third of foolscap.

Marks are scored for “window,” “on each side,” “door,”
“right handside,” “postcards,” “birthday cards,” “one penny
each,” “rows,” “six,” “hung up,” “ticket,” “over each row,”
“painted blue,” (the “one penny” is not, of course, scored again),
“games,” “in the window,” “of all sorts,” “Ludo,” “Snakes
and Ladders,” “Draughts,” “books,” “story,” “for children,”
“toys,” “engines,” “motor-omnibuses,” “taxi-cabs,” “cabs,”
(the common cab was not then a taxicab), “a piece of tin,” “man,”
“with a stick,” “hitting,” “woman.”

*Great Northern Railway was not on the truck; there were the letters
G. N. R. The words “Great Northern Railway” were not allowed by the
teacher as an observation; he said it was contrary to the instructions he had
given, which were to cite observations only, without inference.


TABLE III

THE AVERAGE MARKS OBTAINED BY THE BOYS OF GROUP A,
THE PRACTISED GROUP

                   Average Mark for        Average Mark for                 Percentage
Initials of     Practice Exercises      Practice Exercises       Gain          of
Pupils               1, 2, 3                 4, 5, 6                       Improvement
   (1)                 (2)                     (3)                (4)          (5)

[illegible]            111                     146                 35           32
D. J.                  118                     137                 19           16
S. G.                   82                     110                 28           34
M. A.                   80                      94                 14           18
K. A.                   92                     133                 41           45
S. I.                  133                     184                 51           38
W. A.                   93†                    Left
E. J.                   79†                    Left
W. W.                   75*                     87*                12           16
M. A.                   73                      94                 21           29
R. I.                   48*                 [illegible]
[illegible]             75                      98                 23           31
B. W.                   81†                    Left
B. T.                   59                      64                  5            8
P. W.                   69                      54                −15          −22
W. W.                   69                      77                  8           12
M. B.                   57                      77                 20           35
M. R.                   61                      71                 10           16
R. R.                   54                      57                  3            6
S. S.                   46                      69                 23           50
H. A.                   73                      80                  7           10
M. W.                   39                      57                 18           46

*Two only
†One only





One more extract will be given, this time from the first prac- 
tice exercise by one of the abler boys, though by no means one of 
the best at this work. The reader will be left to mark it for himself. 

Albert K— wrote: 

“Before me is a small house around which are many cattle. This is a
farmhouse which is three stories high, has two roof windows, a small tower
which is surmounted by a guilt point. In the second story are eight windows
two at side and six in front. On the ground floor are two double doors. There
is a small path round the front which is sheltered by a shed. At one end a small
[...] tethered by a narrow strap. On its back is a blue blanket with white
[...] The horse is a grey colour. At the opposite end is a small tree which
[...] main boughs. By the side of this are two cock fowls. At the other
[...] the tree are a black and black and white goats which appear to be
[...]” and so on for rather less than a closely written page of foolscap.


TABLE IV

TOTAL MARKS OBTAINED IN STANDARDS VI AND VII, SCHOOL Y BOYS, IN THREE
PRELIMINARY TESTS IN OBSERVATION AND THREE FINAL TESTS, AFTER
SIX PRACTICE EXERCISES BY GROUP A IN OBSERVATION AND BY
GROUP B IN ARITHMETIC

[Table body not recoverable; the surviving entries are several rows marked
“Unpaired” and the initials M. S., L. W., and F. W.]

Results.—I give first the improvement shown during the practice
exercises in observation.


Comments on Table III:


1. The percentage of increase from the first, second, and third
practice exercises to the fourth, fifth, and sixth practice exercises
ranges from 50 to 6 percent. There is one case in which
loss occurs, that of P. W. I cannot account for it; he is a
steady worker.

2. The r coefficient between the two series of averages is .94; their
mean difference is 18, with a probable error of 2.345.
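
The gain and percentage of improvement in Table III may be recomputed in a few lines. The Python sketch below assumes that the percentage of improvement is the gain expressed as a percent of the earlier average, rounded to a whole number; it uses only the eighteen boys with complete records, and the first average of 54 for R. R. is an inference from his printed gain of 3. All names are illustrative.

```python
from statistics import mean

# (average mark on exercises 1-3, average mark on exercises 4-6) for the
# eighteen boys of Group A with complete records in Table III.
# Assumption: the fifteenth pair begins with 54, inferred from the gain of 3.
averages = [
    (111, 146), (118, 137), (82, 110), (80, 94), (92, 133), (133, 184),
    (75, 87), (73, 94), (75, 98), (59, 64), (69, 54), (69, 77),
    (57, 77), (61, 71), (54, 57), (46, 69), (73, 80), (39, 57),
]


def gain(first, second):
    """Difference between the later and the earlier average mark."""
    return second - first


def percent_improvement(first, second):
    """Gain as a whole-number percent of the earlier average (assumed definition)."""
    return round(100 * gain(first, second) / first)


gains = [gain(a, b) for a, b in averages]
percents = [percent_improvement(a, b) for a, b in averages]
mean_gain = mean(gains)

# Largest and smallest positive improvement, and the mean gain.
print(max(percents), min(p for p in percents if p > 0), round(mean_gain))
# prints: 50 6 18
```

On these assumptions the computed improvements run from 50 down to 6 percent, with the single loss of 22 percent for P. W., and the mean gain comes to 18 marks, in agreement with the comments above.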

Comments on Table IV:

1. Several boys in both groups have been left unpaired. S.J. and
P.M. have been paired, partly on their teacher’s judgment,
with S.E. and W.A., as the former were not then attending the
local baths: their marks for the other two tests in the preliminary
series were practically identical with those of S.E. and W.A.


TABLE V

[...] GROUP A, SECTION BY SECTION, OF THE PAIRED CASES, IN THE
PRELIMINARY AND FINAL TESTS

[Table body not recoverable; for each of Group A (practised) and Group B
(non-practised) the columns gave the number of boys, the average mark in
the preliminary tests, and the average mark in the final tests.]

2. One of the boys, F. W., in Group B, took out newspapers daily
along that section of the road which was to be described in the
second final test; his record was 135 as against 45 and 37 in
his other tests of the final set.

3. A boy in Group A, W.W., asserted that he had only once seen
the shop which he was required to describe: he had not, he said,
given much attention to it: his mark was 80 as against 113 in
the preceding test. R.E., also in this section, had been absent
during a long period prior to the final tests, his absence involving
four of the practice exercises, but K.C. in Group B, his
paired associate, would most probably have beaten him in any
case.

4. The r correlation coefficient between the two series of total marks
in the preliminary tests is .962 calculated from the nearest tens:
their mean difference is .3, with a probable error of .18. The r
coefficient between the two series of total marks in the final tests
is .362: the difference between the means is 6.1, with a probable
error of 2.48.

5. There is a significant differential gain by Group A of 22 percent. 


Comments on Table V: 

1. As has been already pointed out, the second section of Group A 
(counting downwards) was unfortunate; the best boy in the 
section had left before the final tests, and the last boy was ab- 
sent from school for a long period involving four out of the six 
practice exercises. 

General conclusions from the work of the boys in school Y. 
The boys of the practised group have gained very decidedly as com- 
pared with those of the non-practised group. In this experiment 
the boys were taught by collective and individual correction how 
to state their observations so as not to introduce irrelevancies.
The general conclusions are as before. No profitable conclusions
can be drawn as to the cause of the differences between the previous
results for girls and these for boys. The boys were of an inferior
social class, they were in higher “standards,” and older; moreover,
their exercises in observation were corrected, those of the girls were
not. These differentiating conditions were employed in order that 
the conclusions, if any, from the series of experiments should not 
depend on any of them. 


IV. A THIRD EXPERIMENT: SCHOOL Y GIRLS 


General plan of the experiment.—The general plan of this experiment
was the same as in the previous one and was carried out
in the same school-building. A whole class under one teacher, of
girls on this occasion, was divided into two equal and parallel
groups on the basis of preliminary tests to show whether the children
were more or less observant of the things which lay around
them in their daily lives. Then one group was given practice exercises
in observation whilst the other group worked arithmetical
examples. I had not intended that the non-practised group should
work at arithmetic in this school; my intention was that the girls
of that group should read very long stories and reproduce in writing
as much of them as they could remember whilst the practised
group were writing up their observations. I did not think that
the comparative results would be much, if at all, affected by this
change. For these boys and girls were well practised and proficient
in English composition, and, even were they not, it was rather in
the type of statement required than in general fluency of expression
that improvement would influence these results. The exercises
in fictional reproduction might, of course, be obstructive rather
than helpful. But, in any case, they were not given; for the
teacher in the girls’ school, who was to use the special models constructed
by the teacher conducting a similar experiment in the
boys’ school, followed him also in adopting “arithmetic” as the
medium of practice for the group not to be trained in observation,
a procedure contrary to my written note. Some differences in
the conduct of this experiment from that in the boys’ school as
to the time of day, previous lessons, and other matters will be
shown in the more detailed description which follows.

The girls who did the work.—The work was done with the first 
class of a municipal girls’ school comprising Standards VI and VII, 
taught by one teacher, who was first-rate as a teacher and experi- 
enced in experimental work. It was part of the girls’ school cor- 
responding to the boys’ school which has been described in the 
previous experiment, and occupied part of the same school-building.
Both of the teachers were in the front rank; the boys and
girls belonged to the same or similar families and came from the 
same homes, and the girls were approximately of the same ages as 
the boys. 

Chronology of the experiment.—The boys in the corresponding 
class in this school had worked the tests and exercises in the
afternoons; the girls worked theirs in the mornings, beginning at
10 a.m. The ordinary lessons, according to the school time-table,
were given from nine to ten o’clock. Scripture lesson ended every
morning at 9:40, and the lesson from 9:40 to 10 a.m. on Wednesday,
the day of the week on which we started our experiment, was nature
study, involving some observation of plants. This was unfortunate,
and we avoided Wednesdays after the first and second of the
preliminary tests. The time-table lesson from 9:40 to 10 a.m. on
Thursdays was the recitation of poetry, and that for Friday was
physical exercises. The time allowed for the tests, both preliminary
and final, was forty-five minutes from 10 to 10:45 a.m. For the
practice exercises in observation, ten minutes were allowed for
actual observation, and an unlimited time for writing up the
observations, but it was rare that more than forty-five minutes were
thus occupied. For Group B, the non-practised group, a sufficient
number of arithmetical exercises was given to occupy the children for
the same time and to the same extent as those of Group A. After 
the papers were marked, the girls were told, during the ensuing 
week, but not just before the next test or exercise, the number of 
marks they had scored in the previous test or practice exercise. No 
teaching or correction of any kind was given or made in connection 
with any test or exercise throughout the whole of the experiment.
There was thus more stimulus of a disciplinary kind than in the
work done in the girls’ school previously cited, but no teaching or
correction of errors as in the corresponding boys’ school. The
dates, tests, and exercises in observation follow. 


I. Preliminary Tests
  1) A tramcar                                      Wednesday, Feb
  2) An open park-like space near the school        Wednesday, Feb
  3) A section of the neighbouring High street      Thursday, Feb
II. Practice Exercises
  1) Model of a farmyard                            Thursday, Mar
  2) Picture of the Battle of Trafalgar             Thursday, Mar
  3) Model of a railway line                        Thursday, Mar
     Easter Holidays
  4) The teachers’ room                             Thursday,
  5) Objects used for drawing obtained from the
     boys’ school                                   Friday,
  6) Groups of toys, laid out on a large table      Friday,
III. Final Tests
  1) A horse omnibus                                Thursday,
  2) Section of a busy street near the school       Friday,
     Whitsun Holidays
  3) A centre for instruction in laundry            Thursday, May


The marking of the tests and exercises.—The method of marking
was precisely the same as that adopted in the previous experiment.
Two or three brief extracts from the actual papers follow.
Lily P—, in her second preliminary test wrote:

“The picture is about Nelson and some other men in a ship. The ship
looks as if it were going to sink. Their are eighteen men on the ship. In
the picture it says “England expects every man to do his duty. Nelson is
talking to three other men, one of men as a sword in a case. And Nelson has
three medals on. Two men are pulling the strings of the sails. They all look
serious and frightened. Their are two other ship near by,” and so on for a
page and a third of foolscap.


Marks are given for “picture,” “Nelson,” “men,” “ship,”
“in a ship.” The ship does not look as if it is going to sink, so no
mark is awarded for this description. A mark is given for
“eighteen,” but not for “men” or “in the ship,” because these
observations have been scored before. “England” scores, and “expects,”
“every,” “man,” “to do,” “his duty” also score. “Nelson” does
not score again, but “talking” does, and “men,” “medals,”
“three” and “on.” “Two,” “pulling,” “strings,” “sails,”
“all,” “serious,” (we did not think the men did look frightened),
“two,” “ships,” and “near by” also score.


Ellen R—, in the first of her final tests, wrote:

“An omnibus which is pulled by horses has four large wheels, and on the
front is a seat for the driver which is as high as the top of the omnibus. There
are a number of seats and each seat is room enough for two persons. On the
outside is a number of advertisements of which some were large. On the top
there is room for 20 persons and inside carries 16 persons. Inside the bus
on the little windows are small advertisements, these little windows were
coloured. The seats are made of wood and there are red cushions on these and
the seats upstairs are made of wood. There are two steps which reach to the
inside and there are another number of steps which reach to the top. The
driver holds the reins,” and so on for a closely written page of foolscap.


“Omnibus pulled by horses” does not score for that is given
in the question, but marks are obtained for “wheels,” “large,”
“four,” “seat, on the front, for the driver,” “as high as,”
“top,” “seats,” “number,” “room enough,” “people,” “two,”
“outside,” “advertisements,” “some,” “large,” “on the top,”
“persons,” “20,” “inside,” “16,”³ “windows,” “inside,” “little,”
“advertisements,” “small,” “coloured,” “seats,” “of wood,”
“cushions,” “red,” “seats,”⁴ “of wood,” “steps,” “two,” “which
reach,” “to the inside,” “steps,” “another number,” “which
reach,” “to the top,” “driver,” “holds,” and “reins.”

Sarah P—, in her sixth practice exercise, wrote:

“On teacher’s table was bear pulling a cart, then there was another cart,
two cups, two saucers, a tea-pot, a money-box, a basket, a tede bear, a
guine-pig, a rabbit skin, and a rabbit sitting on a piece of wood. The cart the bear
was pulling was made of tin with three bells on the top the wheels of the cart
are made of tin painted red. The bear had brown fur with a piece of brown
lether round its neck, over its back there was a piece of iron and that fastened
on to the cart,” and so on for a page and a half of foolscap.

I leave the scoring to the reader.

³ I am doubtful of these numbers, but they were scored as correct at that
time by the teacher.
⁴ Seats have been observed in two places and are different in appearance.

Results.—In this case I give only one compendious table; the 
necessary statistical checks worked out from the individual figures
will be given in the subjoined comments on the table. 


Comments on Table VI: 

1. The r correlation coefficient between the two series of average 
marks in the preliminary tests is .94: their mean difference is 
1, with a probable error of .29. 
The r coefficient between the two series in the final tests is .12: 
the mean difference is 15, with a probable error of 3.9. 


TABLE VI

AVERAGE MARKS OF GROUP A ON THE THREE PRELIMINARY TESTS AND OF
GROUP B ON THE THREE PRELIMINARY AND THREE FINAL TESTS,
AS WELL AS DESIGNATED GROUPS OF PRACTICE EXERCISES

Columns, Group A (Non-practised): Number of Girls; Average Mark,
Preliminary Tests; Average Mark, Final Tests
Columns, Group B (Practised): Number of Girls; Average Marks,
Preliminary Tests; Average Mark, Practice Exercises; Average Mark,
Final Tests

                  Average Mark,        Average Mark,
                  Preliminary Tests    Final Tests
150 and over            57                 98
125-150                 43                 82
100-125                 38                 86
75-100                  31                 94
Below 75                22                 74
Average                 40                 89

                       106                104

a One child was absent in the final tests.
b One child left school during the practice exercises.


The r coefficient between the two series of average marks for
the first, second, and third, and the fourth, fifth, and sixth
practice exercises is .72: the difference between the means is 40,
with a PE of 2

It will be remembered that there was a time limit for writing
up the observations in the preliminary and final tests; but not
in the practice exercises; also that the preliminary and final
tests dealt with the observations which had or had not been
made in the ordinary course of daily life; the observations for
the practice exercises had been made ad hoc.

There is a differential statistically significant gain of 55 per cent
by the practised group; the non-practised group gain 111 per
cent in their “finals” on their “preliminaries;” the practised
group gain 166 per cent.


General conclusions from the work of school Y girls—As in the 
two experiments previously described, the training by practice exer- 
cises in observation has produced a definite, and indeed, a great 
improvement in the observation of things which have always been 
in the way of being observed by the children and which may have 
already been noticed by them with more or less attention and preci- 
sion. Of course, the tests themselves, apart from the exercises, 
form a training not to be neglected. The practice exercises are 
only more definite, more limited, and more precise than the observa- 
tions of daily life. The instructions in this school, as in the others, 
for both tests and exercises were the same, ‘‘ Write down what you 
have yourself observed; no marks will be given for anything else.’’ 
Doubtless there is also training in the tests, not only as a stimulus 
to actual observation, but as bringing into clear consciousness and 
remembrance the results of those observations which may, or may 
not, have been forgotten in the interim, a process which is a part, 
and a useful part, of what I have called the functional activity 
of remembering, and which I believe to be a common element in 
all memory work. Direct mnemonic training in the practice exer- 
cises, of course, is of short-period memory only. 


Some pedagogical inferences.—This Journal is not the place to 
elaborate pedagogical inferences from the foregoing experiments; 
but a word or two may be necessary to prevent misunderstanding. 

I advance no conclusions concerning the “training of the
faculty of observation.” I regret the phrase; it has had unfortunate
pedagogical implications, as all conceptions of training in vacuo,
apart from content, must have. But there is a place in the early
school lives of children for the more or less unspecialized
observation of things around them, apart from and in addition to the
observations required in learning to read, write, and do arithmetic.
And, later on, when the boy or girl settles down to more definite
subjects of instruction, there is still a place, a vital place, for
direct observation. Nature talks, for example, are an abomination.
Moreover, provided that we do not cultivate observation to excess
in any one direction (a difficult thing to do with children, though
easy for the specialized adult) we shall find spread and transfer
to other fields, unless we make our exercises too long or place them
too closely together in time, not only of the content of our
observations, but also of our methods and of the perceptual activity itself.






THE CORRELATION COEFFICIENT AND ITS
PROGNOSTIC SIGNIFICANCE

CLARK L. HULL
University of Wisconsin


I. TRIGONOMETRIC AND OVERLAPPING RELATIONS AMONG TESTS AND 
APTITUDE CRITERIA 


THERE are numerous points of view from which the correlation 
coefficient may be regarded; one of these is that of trigonometry. 
According to this view, the correlation coefficient is the natural 
tangent of one of the lines of regression with the vertical, say. This 
is a fundamental fact. To the trained mathematician this fact is 
doubtless illuminating and satisfying, but for the psychologist or 
the student of educational research, to whom mathematics is a
means rather than an end, the trigonometric aspects of correlation 
are likely to be neither illuminating nor satisfying. 
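
A small numerical check may make the trigonometric reading more concrete. The sketch below assumes that both variables are first expressed in standard-deviation units; under that assumption the least-squares regression slope, which is the tangent of the angle the regression line makes with the axis of the predictor, comes out equal to r. The scores and the function names are illustrative only and are not part of the original argument.

```python
import math

def standardize(values):
    """Convert scores to standard-deviation units (population SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Product-moment correlation: mean product of standard scores."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def standardized_slope(x, y):
    """Least-squares slope for predicting y from x, both in SD units."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

# Illustrative scores for five subjects (assumed for this sketch).
x = [8, 5, 7, 9, 11]
y = [2, 4, 6, 13, 10]

r = pearson_r(x, y)
slope = standardized_slope(x, y)
angle = math.degrees(math.atan(slope))
print(f"r = {r:.3f}, slope = {slope:.3f}, angle = {angle:.1f} degrees")
```

With raw scores the slope would also carry the ratio of the two standard deviations, which is why the equality holds only after standardizing.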

A second point of view considers the relations among the factors 
which may produce a tendency to correlation. This is of interest 
to persons concerned with the theory of mental testing. Suppose 
that two variables are each produced by the joint action of ten 
independent factors or determiners, that each determiner is of 
equal importance, and that two of these determiners are common 
to both variables. This overlapping of the two variables will
obviously produce a tendency to correlation. If we let $N_c$ represent
the number of common determiners and $N_1$ and $N_2$ the total number
of determiners in the respective variables, then the correlation
between the two variables will be given by the formula,¹

\[ r = \frac{N_c}{\sqrt{N_1 \cdot N_2}} \]

Substituting the above values in this formula we have

\[ r = \frac{2}{\sqrt{10 \cdot 10}} = 0.20. \]


¹ Brown, William M., and Thomson, Godfrey H. Essentials of Mental
Measurement. London, Cambridge University Press, 1921. p. 176.

By means of the same formula it may be shown that where two
groups of ten factors are involved, as assumed above, the
overlapping of one factor gives an r of .10; three factors, .30; five
factors, .50; six factors, .60; seven factors, .70; and so on. In
short, the correlation coefficient in such a situation as is assumed,
is equivalent to a simple percentage statement of the degree of
identity of the determiners of the respective variables.
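
The overlap rule just stated can be checked with a few lines of Python. This is only a sketch under the text's own assumptions (independent determiners of equal importance): the correlation is taken as the number of common determiners divided by the square root of the product of the two totals. The function and argument names are hypothetical.

```python
import math

def overlap_correlation(n_common, n_first, n_second):
    """Correlation implied by shared determiners of equal weight.

    n_common: determiners common to both variables
    n_first, n_second: total determiners in each variable
    """
    return n_common / math.sqrt(n_first * n_second)

# Two variables of ten determiners each, sharing from one to seven of them.
for shared in range(1, 8):
    print(shared, round(overlap_correlation(shared, 10, 10), 2))
```

When the two variables have unequal numbers of determiners the same function still applies, but the result is no longer a simple percentage of either total.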

It should be observed, however, that to say that the determiners 
of two variables are identical to the extent of 50 percent is by no 
means the same thing as to say that the resulting variables will re- 
semble each other to that extent, or that knowing one the other can 
be estimated with that degree of efficiency. It will be shown
presently that this is far from being the case (see Table IV). The
inference that human behavior is organized on any such simple
basis must not be drawn from the supposititious example just given.
Indeed, there is every reason to believe that two human activities
will rarely be of the same complexity, that two determiners
will rarely be of equal importance within a given activity,
and that a given determiner found in both activities will rarely
have the same relative importance in each. It is possible to
compute the correlation among variables as complexly related as those
just described. Such computations show that even under the more
complex conditions assumed, it still remains substantially true that
the correlation coefficient is at least a minimum percentage
statement of the amount of identity of the determiners of the respective
types of activity. This means that a correlation of +.50 will
always mean an identity of more than 50 percent among the
determiners of two variables and sometimes considerably more. This
fact is of fundamental importance for any comprehensive theory 
of behavior tests and testing. 


II. NUMBER OF SUBJECTS RIGHTLY PLACED BY A TEST PROGNOSTICALLY
AMBIGUOUS

To the applied psychologist the correlation coefficient is likely 
to be of interest largely because it is functionally related to the 
forecasting efficiency possessed by various tests with which he 
may be working. While not the most significant, probably the 
most common, form in which the question of forecasting efficiency
has been raised is: If a test or battery of tests correlates with a 
criterion to the extent of .60, say, what percent of those individuals 
forecasted by the tests, as likely to fall within a given range of the 
criterion, will really fall there? Let us take, for example, the 










May, 1927 CORRELATION COEFFICIENT 329 


National Intelligence Tests and the ordinary five-step scholastic
marking scale of A, B, C, D, and E. Of those people who are
indicated by the tests as probably falling in the middle of any one
of the five zones of the marking scale, what percent will actually
fall at least somewhere within the zone? If, as is usually the case,
the two distributions involved are approximately homoscedastic
and homoclitic² and the two regression lines are approximately
straight, the question is susceptible of a fairly definite answer.
In the case just mentioned, for example, the number of people 
indicated by the tests as probably falling in the middle of the 
various zones of the scholastic marking scale and who will really 
fall somewhere within the zone, is approximately 38 percent. 

The method by which this figure is reached involves the con- 
sulting of a table of the probability integral such as is found in 
Thorndike’s Mental and Social Measurements.³ It will be seen that
approximately 96 percent of the normal distribution falls within
2 SD of the mean. This indicates that a total range of 4 SD com- 
prises practically all of the normal distribution. Accordingly, the 
standard deviation of a distribution of criteria ranging over a 5- 
point scale will be roughly one-fourth of 5 or 1.25 points. 

Now, if a large number of individuals are predicted by a test
to fall at any given place on a scale, few will fall at exactly that
point, though the average of the group will tend to fall there. Some
will be better and some worse than predicted, and their distribution
will be approximately normal. In general, the standard deviation
of such a prediction group is $\sqrt{1-r^2}$ times the standard
deviation of the criterion. Since in the above example r is .60,
$\sqrt{1-r^2}$ is .80, and .80 × 1.25 (the standard deviation of the
criterion) is 1.00, which is the standard deviation of any prediction
group. This is known as the standard error of estimate. We are now in
a position to determine what percent of any prediction group falls
within .5 step of where predicted, that is, of its mean. Dividing the
distance (.5 step) by the standard deviation (1.00) we obtain the
ratio of .5. Looking this up in Thorndike’s table, just mentioned,
we find that approximately 19 people in 100 will fall between the
mean and one-half step on either side or 38 people on both sides.
Thus we arrive at the 38 percent just mentioned.





² Kelley, Truman L. Statistical Method. New York, Macmillan Company,
1923. p. 172.
³ Thorndike, E. L. New York, Teachers College, Columbia University,
1922. p. 219.
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It is important to observe that, while the actual precision of 
forecasting is always the same for any given situation, the percent 
of predicted individuals falling within any class interval increases
as the scale is coarsened and decreases correspondingly as the scale
is made finer. As an illustration of this, consider the same
situation just described except that the criterion is on a coarse 3-point
scale of good, fair, and poor. In this case the standard deviation
of the new marking scale is one-quarter of 3 or .75.

That is,

\[ SD\sqrt{1-r^2} = .75 \times .80 = .60 \]
\[ .5 \text{ step} \div .60 = .83 \text{ (ratio)}. \]





TABLE I

CORRELATION AND COARSENESS OF A SCALE

            COARSENESS OF CRITERION SCALE
   r      10 points    5 points    3 points
  (1)        (2)         (3)         (4)
  .00        16          31          50
  .40        17          34          53
  .50        18          35          56
  .60        20          38          59
  .70        22          42          65
  .80        26          50          73
  .90        35          64          87
  .95        48          80          97













Looking up this ratio in Thorndike’s table, we find that approxi- 
mately 59 people out of a hundred in such a prediction group would 
fall within a half-step of where predicted, on a 3-point scale. The
increase from 38 percent to 59 percent is merely the result of
coarsening the scale, and in no sense indicates any increase of
forecasting precision. In a similar manner it might be shown that
if the scale is made finer so that 10 points are used instead of 3
and everything else remains constant, the percent of people
falling within a half-point of where predicted shrinks to 20 percent.
The corresponding percents resulting from the combinations
of scales of various degrees of coarseness and correlations of
various sizes, are shown in Table I.
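
The procedure behind Table I can be sketched in Python as follows. The sketch follows the text's approximations (the whole scale spans 4 SD, and the spread of a prediction group is the criterion SD times the square root of 1 minus r squared), but it replaces the printed probability table with the normal integral computed from the standard library's `math.erf`; that substitution and the function names are assumptions of this sketch.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the unit normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percent_within_half_step(r, scale_points):
    """Percent of a prediction group within half a step of its mean.

    Assumes, as the text does, that the scale spans 4 SD, so that the
    criterion SD is scale_points / 4. Not defined for r = 1.
    """
    criterion_sd = scale_points / 4.0
    standard_error = criterion_sd * math.sqrt(1.0 - r * r)
    ratio = 0.5 / standard_error
    return 100.0 * (2.0 * normal_cdf(ratio) - 1.0)

for r in (0.00, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95):
    row = [percent_within_half_step(r, points) for points in (10, 5, 3)]
    print(f"{r:.2f}", " ".join(f"{p:5.1f}" for p in row))
```

Because the ratio is not rounded to two decimals before the integral is taken, an occasional entry may differ from the printed table by a point.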

It is important to observe that the percent of individuals fall- 
ing within a half-point of where predicted on any particular scale 
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is not a measure of the forecasting efficiency of a test. To dissi- 
pate any such illusion it is necessary only to observe that a test 
correlating zero, that is, having no efficiency whatever, would place 
31 percent correctly on a 5-point scale and as many as 50 percent
on a 3-point scale (see Table I). At the other extreme it may be
observed that a test correlating .95, on a 10-point scale, would place 
only 48 percent correctly. Such paradoxical results make it quite 
clear that despite the apparent simplicity of the notion, the per- 
cent of individuals correctly placed is by no means a satisfactory 
index of the forecasting efficiency of a test. This is true to an 


TABLE II

MINIATURE SET OF DATA SHOWING A SERIES OF CRITERION SCORES AND TWO
SERIES OF TEST SCORES

Subject     Criterion    Test A    Test B
Number      (X_0)        (X_1)     (X_2)
  (1)         (2)          (3)       (4)
  I            2            8         8
  II           4            5         5
  III          6            7         2
  IV          13            9        14
  V           10           11         6
Means          7            8         7
SD's           4            2         4











even greater degree where, instead of raising the question in terms 
of a linear scale, it is put in terms of ranks or percentiles which
obviously cannot be treated like ordinary units.⁴


III. PERCENT OF PERFECT FORECASTING EFFICIENCY (E) BASIC IN THE
TESTING OF APTITUDE 


The real forecasting efficiency of a test is at bottom a function 
of the amount of error resulting when the test is used to forecast 
or estimate a criterion. The basic notion is simple. If the error
made by a battery of tests correlating zero with a criterion, that 
is, a battery of no efficiency whatever, were 16 and the error made 
by a fairly good battery were 12, the second battery would clearly 


⁴ A series of tables based on percentiles has been published by Landis,
M. H., Burtt, H. E., and Nichols, J. H., “The Relation between Physical
Efficiency and Intelligence,” American Physical Education Review, 28: 220-21,
May, 1923.
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have reduced the forecasting error by 4 points out of a possible 16. 
The test is therefore, in a perfectly simple and direct manner, 25
percent efficient. Similarly, if the forecasting error made by a
battery were 14.4 points when predicting the same criterion, then
this battery would serve to reduce the forecasting error by 1.6
points out of a possible 16. In a perfectly simple and direct
manner, this second battery would therefore be 10 percent efficient.
Fortunately, a simple formula based on the correlation coefficient
expresses exactly this relation. Letting E stand for forecasting
efficiency, it is,

\[ E = 1 - \sqrt{1 - r^2} \tag{1} \]

where E represents the percentage efficiency of a test in predicting
a criterion.

That this formula expresses the actual percent of reduction in
the forecasting error, or error of estimate as it is sometimes called,
may be seen by a series of computations based on the miniature set
of data shown in Table II. With these particular data, forecasts
or estimates of the criterion may be made by substituting the test
scores of any given subject in a prediction formula worked out
for these particular data. It is,

\[ \bar{X}_0 = .968 X_1 + .381 X_2 - 3.41 \]

Substituting the test scores of subject No. I we have,

\[ \bar{X}_0 = .968 \times 8 + .381 \times 8 - 3.41, \]

and solving,

\[ \bar{X}_0 = 7.38. \]

The estimate or forecast, in this case 7.38, is rather wide of the
mark since the true criterion score is 2. The error in this case is
therefore 5.38 points. The remaining forecasts, made in the same
manner, are shown in Table III, Column 3, and the corresponding 
errors are shown in Column 4. For purposes of comparison the 
estimates which would be made by a similar forecasting formula 
for a battery having a zero correlation with the criterion, are given 
in Column 6 and the corresponding errors in Column 7. 

A preliminary view of the forecasting efficiency of the tests 
given in Table II is shown by comparing the errors in Column 4 
with those in Column 7. The latter averages 3.6 points of error, 
whereas the former averages 2.15 points. The tests have thus re- 
duced the average amount of error by 1.45 points of a total possi- 
bility of 3.6 points. From these results an approximation to the 
forecasting efficiency might be computed directly. It is customary, 
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however, for statisticians to use the root-mean-square error of 
forecast rather than the simple average as we have done here, just 
as they usually prefer the standard deviation rather than the mean 
variation as a measure of variability. While the standard devia- 
tion measures the same thing as the mean variation, it does it 
somewhat more reliably. So the root-mean-square error of fore- 
cast or the ‘‘standard error of estimate’’ as it is called, measures 
the same errors of forecast but in general does it somewhat more 
reliably. To secure this figure the errors are simply squared, the 
squares averaged, and the square root taken of this average. These 


TABLE III

ERRORS OF ESTIMATE OF A BATTERY CORRELATING .719 WITH THE CRITERION
AND FOR COMPARISON THE ERRORS OF ESTIMATE OF A BATTERY
CORRELATING ZERO WITH THE CRITERION

                        BATTERY SHOWN IN TABLE II          BATTERY CORRELATING ZERO
                        CORRELATING .719 WITH CRITERION    WITH CRITERION
Subject   Criterion   Estimate        Error of   Error      Estimate        Error of   Error
No.       (X_0)       (\bar{X}_0)     Forecast   Squared    (\bar{X}_0)     Forecast   Squared
 (1)        (2)         (3)             (4)        (5)        (6)             (7)        (8)
 I           2          7.38            5.38     28.9444       7               5         25
 II          4          3.33             .67       .4489       7               3          9
 III         6          4.13            1.87      3.4969       7               1          1
 IV         13         10.64            2.36      5.5696       7               6         36
 V          10          9.52             .48       .2304       7               3          9
Mean         7          7               2.15      7.738        7               3.6       16
Square root of mean                               2.78                                    4























results for the two sets of errors are shown in Columns 5 and 8. 
The standard errors of estimate thus secured are 2.78 and 4.00, 
respectively. The tests of Table II accordingly reduce the stand- 
ard error by 1.22 points or 30.5 percent. In a very simple and 
direct sense, therefore, the battery of tests which correlates .719
with the criterion is 30.5 percent efficient. The correlation coeffi-
cient is thus seen to be in no sense a percentage statement of the 
forecasting efficiency of a test or a battery of tests. In this case, 
for example, the efficiency is less than half the size of the corre- 
lation coefficient. 
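
The direct computation described above can be retraced in a short Python sketch. It uses the five subjects of Table II and the prediction formula quoted in the text; the variable and function names are hypothetical. The zero-correlation battery is represented, as in Table III, by forecasting the criterion mean for every subject.

```python
import math

# Table II scores for subjects I to V: criterion, Test A, Test B.
criterion = [2, 4, 6, 13, 10]
test_a = [8, 5, 7, 9, 11]
test_b = [8, 5, 2, 14, 6]

def forecast(a, b):
    """Prediction formula quoted in the text for these data."""
    return 0.968 * a + 0.381 * b - 3.41

def root_mean_square(errors):
    """Square the errors, average the squares, take the square root."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

battery_errors = [forecast(a, b) - c
                  for a, b, c in zip(test_a, test_b, criterion)]

# A battery correlating zero can only forecast the criterion mean.
criterion_mean = sum(criterion) / len(criterion)
zero_errors = [criterion_mean - c for c in criterion]

se_battery = root_mean_square(battery_errors)
se_zero = root_mean_square(zero_errors)
efficiency = (se_zero - se_battery) / se_zero

print(f"standard error, battery: {se_battery:.2f}")
print(f"standard error, zero-correlation battery: {se_zero:.2f}")
print(f"forecasting efficiency: {100 * efficiency:.1f} percent")
```

Since the sketch does not round each forecast to two decimals before squaring, its efficiency figure may differ from the printed 30.5 percent in the first decimal place.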

The computations carried out here have been given largely for 
purposes of explanation. As a matter of fact, the two standard
errors of estimate, and consequently the forecasting efficiency, can
be obtained much more simply from a knowledge of the standard
deviation of the criterion and the correlation coefficient. The
formula is:

\[ \text{SE of Est.} = SD\sqrt{1 - r^2} \tag{2} \]

Since the first battery correlates .719 with the criterion, the
standard error of estimate is,

\[ \text{SE of Est.} = 4\sqrt{1 - .719^2} = 2.78, \]

and the second,

\[ \text{SE of Est.} = 4\sqrt{1 - .00^2} = 4, \]

exactly as was found by the direct computation. From these figures
the forecasting efficiency can be computed just as in the last
paragraph.

There is a yet simpler method of determining the percent of 
forecasting efficiency. It will be observed by an inspection of 
formula (2) that where r is .00 the standard error of estimate is al- 
ways the same as the standard deviation of the criterion. Thus in 
the example just given the standard error of estimate was 4 which 
is the same as the standard deviation of the criterion (see Table II). 
Accordingly, what we really have been doing in securing the per- 
centage efficiency is to subtract the standard error of estimate of 
our battery from the standard deviation of the criterion and then 
divide this difference by the standard deviation of the criterion. 
Letting E stand for forecasting efficiency, in symbols this is:

\[ E = \frac{SD - SD\sqrt{1 - r^2}}{SD} \]

Cancelling out the SD’s from the above expression we have,

\[ E = 1 - \sqrt{1 - r^2} \]

which is the standard formula (1) for the forecasting efficiency
of a test, as mentioned above. Substituting the r of the battery
under consideration, this becomes,

\[ E = 1 - \sqrt{1 - .719^2} = 1 - .695 = .305 \text{ or } 30.5\% \]

exactly as by the more laborious method.
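
As a quick check on the short formula, the following Python sketch evaluates E for a range of correlations. The function name is hypothetical, and the list of r values is chosen here only for illustration.

```python
import math

def forecasting_efficiency(r):
    """E = 1 - sqrt(1 - r^2), as a fraction of perfect forecasting."""
    return 1.0 - math.sqrt(1.0 - r * r)

for r in (0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.98, 1.00):
    print(f"r = {r:.2f}  E = {100 * forecasting_efficiency(r):5.1f} percent")
```

Calling `forecasting_efficiency(0.719)` gives the efficiency of the miniature battery discussed above, to within rounding.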

It should be added that the standard error of estimate is
sometimes used to indicate the efficiency of a test. It has the advantage
that it is basic as already pointed out in connection with the
explanation of the nature of E. Moreover, it is relatively little
influenced by the fluctuation in the size of r which sometimes results
from the subjects tested being, as a group, abnormally narrow or
abnormally wide in their range of abilities. Undoubtedly, for
certain purposes these considerations should be decisive; but, for
purposes of comparison of the efficiency of tests in general, the
standard error of estimate has a fatal defect. It consists in the
fact that it is measured in no constant unit. For example, the
unit in one case will be a scale of 5 points, in another it may be
some kind of a percentage scale, in a third it may be in terms of
the number of work pieces correctly turned out or of salaries
received and so on. In addition the standard error of estimate is
a negative form of statement, the larger the figure the less
forecasting efficiency, which makes rather awkward thinking.


TABLE IV

RELATION OF THE CORRELATION COEFFICIENT TO THE PERCENT
OF FORECASTING EFFICIENCY

   r      E (Percent)       r      E (Percent)
  (1)        (2)           (1)        (2)
  .20          2           .80         40
  .30          5           .90         56
  .40          8           .95         69
  .50         13           .98         80
  .60         20          1.00        100

In this connection it should be observed that, theoretically, for 
the E in the above formula to be strictly accurate the r upon which 
it is based should be corrected for any abnormality in the varia-
bility of the subjects from whom it was obtained. 


IV. RELATION OF E TO r AND ITS IMPLICATIONS AS TO FORECASTING 
POSSIBILITIES 


A systematic view of the relations of the correlation coefficient
to forecasting efficiency is shown in Table IV. To the applied
psychologist perhaps the most striking thing about this table is 
the extremely slow rise of efficiency in the region of the lower half 
of the range of r. The forecasting efficiency rises as much between 
correlations .98 and 1.00 (20 points) as it does between zero and 
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distinctly not to be understood as indicating that the psychologist 
is reaching only from 13 to 30 percent of the factors determining 
human behavior. It has already been pointed out in an earlier 
section of the present paper that the correlation coefficient is an 
understatement of the percent of agreement or identity of de-
terminers of criteria and test. This means that when a psychologist 
secures a battery of tests correlating .70 with a criterion, there is 
probably around 80 or 85 percent in common between the causal 
determiners of the two activities. The paradoxical and dishearten- 
ing thing is that even so, the forecasting efficiency will be less 
than 29 percent. 

This raises the question as to how low a forecasting efficiency a
test may have and still be useful. The answer obviously depends
upon how much it costs for test materials and administration on
the one hand, and upon how desirable is the type of information
sought on the other. In most situations, however, it is doubtful whether a
forecasting efficiency of less than about 13 percent (r = .50) will
make the giving of tests worth while.

In this connection the question is often asked, “What is a high
and what is a low correlation coefficient?” Varying answers have
been given from time to time. In the light of the foregoing we 
may make a number of fairly definite statements regarding the 
significance of correlation coefficients of various sizes, as related to 
practical testing. These may be summarized as follows: 

Below .50, practically useless for prognosis
From .50 to .60, possibly useful
From .60 to .70, of genuine but limited value
From .70 to .80, of decided value but rarely or never found
Above .80, not obtained by present methods

It thus appears that as a practical proposition, useful tests are
confined to a narrow zone of efficiency between 13 percent and
30 percent, which corresponds to correlations roughly between .50
and .70. It is doubtful whether any prognostic test ever has
risen consistently above an efficiency of 30 percent. It is always 
hazardous to predict a limit beyond which science cannot attain, 
but the present indications are that unless some more or less rad- 
ical improvement in test construction is discovered, psychological 
tests will be forever doomed to operate at an efficiency under 50 
percent, probably under 40 percent, and very possibly the average
efficiency will not rise much above 25 percent or 30 percent. 

It can hardly be doubted that almost up to the present time
a widespread misapprehension has existed as to the significance
of the correlation coefficient as regards the forecasting efficiency
of tests and batteries of tests. The fact that the correlation coefficient
ranges between zero and 1.00, combined with the universal
habit of thinking in terms of percents, quite naturally leads people
to assume in a more or less vague manner that an r of .50 indicates
a forecasting efficiency of 50 percent instead of the 13 percent which
it does, and that an r of .20 indicates a forecasting efficiency of
20 percent instead of the 2 percent which it does, the latter an
overestimate of 900 percent! This gross misconception has had two
effects. On the one hand, it has doubtless stimulated very greatly
the sale and use of tests which would have been used sparingly,
if at all, if their real efficiency had been realized. On the other
hand, when the grossly exaggerated expectations arising from this
misconception were followed by the concrete realization of the
prognostic limitations of the tests as shown by use, many persons
have gone to the opposite extreme and insisted that tests were of
no value whatever. Upon the whole, however, the net result has
undoubtedly been to create an exaggerated optimism regarding the
value of tests which has greatly aided their development. But
science can hardly connive at a popular misconception.
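
The size of the overestimate can be worked out directly. The sketch below is a brief illustration, assuming the efficiency formula E = 1 - sqrt(1 - r^2), which is not restated in this passage and is therefore an assumption. The helper name `overestimate_percent` is hypothetical.

```python
import math


def overestimate_percent(r: float, round_efficiency: bool = True) -> float:
    """Percent by which reading r as a percent overstates forecasting efficiency.

    ASSUMES E = 1 - sqrt(1 - r^2) as the efficiency formula.
    """
    naive = 100.0 * r                                # r misread as a percent
    actual = 100.0 * (1.0 - math.sqrt(1.0 - r * r))  # assumed true efficiency
    if round_efficiency:
        actual = float(round(actual))                # whole percents, as tabled
    if actual == 0:
        raise ValueError("efficiency is zero at this precision")
    return 100.0 * (naive - actual) / actual


print(overestimate_percent(0.20))         # uses the tabled value of 2 percent
print(overestimate_percent(0.20, False))  # uses the unrounded efficiency
```

With the tabled value of 2 percent the overestimate at r = .20 is the 900 percent given in the text. With the unrounded efficiency it is slightly smaller, roughly 890 percent.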


TEACHING SPELLING BY COLUMN AND CONTEXT FORMS


PAUL McKEE
Colorado State Teachers College 


(Continued from April) 


In last month’s issue of this Journal was described the method
used in a series of three experiments investigating the general
problem of relative efficiency in teaching spelling by column and
context forms, together with the results of the column-phrase
experiment. In this article the two remaining experiments, the
column-sentence and the column-paragraph, will be accounted for.

Each experiment consisted of eight lessons. For each of these 
lessons a column form and the corresponding context form were 
constructed and used as lesson sheets during the appropriate period
of the investigation. 

Two groups of seventh-grade children, 275 in number, were 
formed upon the basis of equal spelling ability as judged by a re- 
liable preliminary test. For the first four lessons of each experi- 
ment, one group used the column form and the other group the 
appropriate context form; during the next four lessons the groups 
were shifted. 

The method of teaching used was that given in Lippincott’s
Horn-Ashbaugh Speller.1 By means of this method it was possible
to measure improvement made during the study and testing of
each weekly lesson. The only variation from this method lay in 
the fact that pupils studied the words in context form and were 
tested in context form during the week of teaching. 

Tests of delayed recall were administered nine weeks after the 
teaching of a given lesson had been completed. These tests consisted
of a column form and a new corresponding context form for each
lesson. Both forms of the test were given to both groups of pupils.
In this way it was possible to obtain a measurement of the ability 
of both groups to spell words previously studied when a period of 
time had elapsed after the teaching process had been completed, and 
also to procure a measurement of the ability of both groups to 
spell words previously studied when used in a new context form. 

From the results of the column-phrase experiment, it is clear 
that the Column Group learned more words during the teaching 


1 Horn, Ernest, and Ashbaugh, E. J. Lippincott’s Horn-Ashbaugh Speller.
Chicago, J. B. Lippincott Company, 1920. pp. vii-xx, 60.
of the weekly assignment and showed greater spelling ability in
the tests of delayed recall, than did the Phrase Group. The two
groups showed approximately equal ability to spell words previously
studied when used in a new phrase form.


RESULTS OF THE COLUMN-SENTENCE EXPERIMENT 


Measurement of improvement.—The three methods of determining
relative improvement used in the column-phrase experiment
are also used in the column-sentence experiment. In the first phase
of this experiment the groups were treated as units, and proper
measures of central tendency and variability were computed.
Columns 2, 3, 4, and 5 of Table V* show the results obtained by the
use of this method.

The Column Group procured a mean score on the initial test 
of 12.67, while the Sentence Group obtained a score of 12.60. The 
difference of .07 was in favor of the Column Group with a probable
error of ± .41. Similarly the mean scores on the final test
were 17.66 and 17.30, respectively, the difference again favoring 
the Column Group. The other facts recorded in the table will be 
similarly read. The differences which were significant are itali- 
cized in the table. 
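
As a rough illustration of how such a difference may be judged, the sketch below divides a difference by its probable error. The threshold of 3 is an assumption: a common convention of the period required a difference of roughly three to four times its probable error, but the article does not state the criterion it used. The function names are hypothetical.

```python
def critical_ratio(difference: float, probable_error: float) -> float:
    """Ratio of a difference between means to its probable error."""
    if probable_error <= 0:
        raise ValueError("probable error must be positive")
    return abs(difference) / probable_error


def is_significant(difference: float, probable_error: float,
                   threshold: float = 3.0) -> bool:
    """True when the ratio reaches the ASSUMED threshold (default 3)."""
    return critical_ratio(difference, probable_error) >= threshold


# Initial-test difference reported above: .07 with a probable error of .41.
ratio = critical_ratio(0.07, 0.41)
print(f"difference / P.E. = {ratio:.2f}; significant: {is_significant(0.07, 0.41)}")
```

On this reading the initial difference of .07 is far smaller than its probable error, which fits the description of the two groups as approximately equal at the outset.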

From an examination of these columns of Table V it is clear 
that in four of the six lessons in which the two groups show an 
approximately equal ability on the initial test, the Column Group 
shows significant superiority in improvement made during the 
week. In two lessons the groups show an approximately equal
amount of improvement. There is some evidence in the case of 
Lesson 16 that the Sentence Group is superior in the amount of 
ability acquired. These facts indicate that the Column Group was
superior to the Sentence Group in the acquisition of spelling ability.

Table VI shows the results obtained when the second method 
of determining relative improvement, that of sectioning the mem- 
bers of the groups by their scores on the initial test, was used. The 
numerical limits of the sections were the same as those used in the 
column-phrase experiment and are indicated at the top of the
table. Section 1 of the Column Group secured a mean improve- 
ment during the week which was .34 greater than that secured by 
the corresponding section of the Sentence Group. There are 44 


*The data summarized in these columns were originally arranged in a
tabulation similar to Tables I, III, and IV of the first part of this article.
cases in the column section and 35 in the sentence section. Section
2 of the Column Group secured a mean improvement during
the week which was .12 greater than that secured by the similar 
section of the Sentence Group. There are 38 cases in the column 
section and 36 in the sentence section. The rest of the table is read 
in a similar manner. The differences in improvement which were 
significant are set in italics throughout the table. 
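
The sectioning procedure can be sketched as follows. The section limits are those printed at the head of Table VI. The pupil scores in the usage lines are invented purely for illustration, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Section limits (initial-test score ranges) as printed at the head of Table VI.
SECTION_LIMITS = {1: (16, 19), 2: (12, 15), 3: (8, 11), 4: (4, 7), 5: (0, 3)}


def section_for(initial_score: int) -> int:
    """Return the section whose limits contain the initial-test score."""
    for section, (low, high) in SECTION_LIMITS.items():
        if low <= initial_score <= high:
            return section
    raise ValueError(f"score {initial_score} falls outside the printed limits")


def mean_gain_by_section(scores):
    """Mean gain (final minus initial) per section for (initial, final) pairs."""
    gains = defaultdict(list)
    for initial, final in scores:
        gains[section_for(initial)].append(final - initial)
    return {section: mean(values) for section, values in sorted(gains.items())}


# Invented scores for five pupils, for illustration only.
sample = [(17, 19), (16, 18), (13, 17), (9, 15), (2, 10)]
print(mean_gain_by_section(sample))
```

Comparing the per-section means for the two groups, section by section, gives the kind of contrast reported in the table.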


TABLE VI
COMPARISON OF IMPROVEMENT OF CORRESPONDING SECTIONS

              Section 1     Section 2     Section 3     Section 4     Section 5
               (16-19)       (12-15)       (8-11)         (4-7)         (0-3)
Group         No.   Gain    No.   Gain    No.   Gain    No.   Gain    No.   Gain
 (1)          (2)   (3)     (4)   (5)     (6)   (7)     (8)   (9)    (10)   (11)
Lesson 9: | 
Column... | 35 841961... 27 .02 | 10 | 1.97 7 | 1.85 
Sentence | 28 14} OTM 1......4 BL.....) 6] 
Lesson 10: | | 
Column. 31 .83 | 38 | 1.48 | 31 | 1.18 | 11 .32 3 
Sentence 7 39 eA oe ie .:  3| eS i ae | 5 1 ¢ 
Lesson 11: 
Column 131} .20/26| .70]13| 1.67 | 3] 1.00] 4 | 
Sentence | 28 4 if Fame 4 3] 4 
Lesson 12: 
Column... | 40 | 1.87 | 29 | 1.68 | 19 | 1.68 | 10 2.3 3 | 1.6 
Sentence | ot PES 5 SE he as | eee 6 | 
Lesson 13: | | 
Column......} 44] .03|41| .04]17] .35| 8| .56] 5 
Sentence 25 o Pe bones yy ae ae es ae 
Lesson 14: 
Column. . 44 41} .64/14/1.76] 5 | .91] 5 | 8.8 
Sentence......| 19 -06 | 33 |... ft eee See 2 | 
Lesson 15: | 
Column......} 44 ..| 37 | 1.06 | 21 | 1.97 | 21 91 
Sentence | 19 Se Dr ¢ ae | ee 6 ae’ 
Lesson 16: F 
Column......| 39 |... i.... 9/ 1.16 | 6 | 4.17 | 2 | 4.2% 
Sentence 134] .96| 40] .74 | 23 | 2 |... 4 





This table shows the column sections were superior to the sentence
sections in the majority of cases. This condition indicates
that the column sections secured greater improvement than the
sentence sections during Lessons 9 through 16.

The comparisons, taking into account the percent of pupils
who improved or failed to improve, may be summarized briefly.
The Column Group was superior to the Sentence Group in all the
lessons of the column-sentence experiment when judged by four
different criteria, namely, (1) continuous improvement during the
week, (2) improvement from the initial to final test, (3) improvement
from the initial to the second test, and (4) improvement from
the second to final test. In six of the eight lessons the Column
Group had a larger percent of pupils who procured perfect scores
on the second test, while in all the lessons this group had a larger
percent of pupils who procured perfect scores on the final test.
These facts, which show the superiority of the Column Group in 
all items of comparison, seem to indicate greater improvement upon 
the part of the Column Group. 

Measurement of ability to spell when a period of time has 
elapsed.—Columns 6 and 7 of Table V show a comparison of the 
ability of the two groups to spell words previously studied nine 
weeks after they were formally taught. An examination of Col- 
umns 6 and 7 will show that in only two of the eight lessons is there 
a significant difference between the mean scores. In these two 
cases the Column Group is significantly superior to the Sentence
Group. When the scores of all eight lessons are pooled, in such 
a way that a distribution of each group over all the lessons as a 
unit is obtained, and the comparison is made between the mean 
scores of the two groups, the result is decidedly in favor of the 
Column Group. These facts indicate that the Column Group was 
superior to the Sentence Group in the ability to spell words previ- 
ously studied nine weeks after the formal teaching process had 
been completed. 

Measurement of ability to spell words previously studied when 
presented in new sentence form.—The tests of delayed recall given 
nine weeks after the lesson had been taught included a column 
test and a new sentence test for each lesson. Both the column and
the sentence tests were given to both groups of pupils. By comparing
the mean score procured by the Column Group on the sentence
test with that procured by the Sentence Group on the same
test, one obtains a comparison of the ability of the two groups to
spell words previously studied when presented in new sentence form.
Columns 8 and 9 of Table V show this comparison. 

In only one case does a significant difference exist between the 
means. In this case the Column Group is significantly superior to
the Sentence Group in its ability to transfer. In four of the re- 
maining seven lessons the mean score of the Column Group is only 
slightly higher than that of the Sentence Group. Pooling the les- 
sons does not show a significant difference. These facts seem to 
indicate that the two groups possess about equal ability to spell
words previously studied when presented in new sentence form.

The results of this experiment provide three outstanding facts:
First, the Column Group acquired more spelling ability during the
teaching of the weekly lessons than did the Sentence Group. Second,
the Column Group shows an ability greater than that of the
Sentence Group to spell words previously studied nine weeks after
the formal teaching process has been completed. Third, the
group that studied by sentence form showed no greater ability to
spell words previously studied when placed in new sentence form
than did the group that studied the same words in column form.


RESULTS OF THE COLUMN-PARAGRAPH EXPERIMENT 


Measurement of improvement.—In the third experiment con- 
trasts are made between words taught in columns and those pre- 
sented in corresponding paragraph forms. Columns 2, 3, 4, and 5 
of Table VII give the means of each group on the initial and final 
tests, the difference in favor of one or the other, and the probable 
error of the difference. Again, those differences which are statis- 
tically significant are italicized. 

In only Lesson 18 and Lesson 20 did the two groups begin the 
week’s work with an approximately equal initial spelling ability. 
On the final test of these same two lessons, the group which studied 
by column form showed a decidedly significant superiority over the 
group which studied by paragraph form. The results of the com- 
parison of the two groups in Lesson 18 and Lesson 20 show clearly 
that the Column Group made much more improvement during the 
week’s work than did the Paragraph Group. 

Table VIII presents the results of comparing relative improve- 
ment of the two groups when divided into subsections using the 
scores made on the initial test as the basis of the division. The 
limits of each section are shown in the table. The results obtained 
show that the column sections are decidedly superior in the com- 
parison of Sections 1 and 3. In the comparison of Sections 2, 4, 
and 5 the column sections frequently show superiority while the 
paragraph sections show greater mean gains in only a few instances. 
It should be remembered that the reliability of these results is lim- 
ited by the small number of cases involved in the sections. 

The advantage of the two groups judged by the percent of 
pupils who improved or failed to improve may be summarized as 
follows: (1) The Column Group was superior to the Paragraph
Group in all the lessons when compared by the four criteria previously
used, namely, continuous improvement during the week,
improvement from initial to final test, improvement from initial to
second test, and improvement from second to final test. (2) In six 
of the eight lessons the Column Group has a larger percent of pupils 
who procured perfect scores on the second test. (3) The Column


TABLE VIII
COMPARISON OF THE AVERAGE GAINS MADE BY EACH OF THE TWO GROUPS

              Section 1     Section 2     Section 3     Section 4     Section 5
               (16-19)       (12-15)       (8-11)         (4-7)         (0-3)
Groups        No.   Gain    No.   Gain    No.   Gain    No.   Gain    No.   Gain
 (1)          (2)   (3)     (4)   (5)     (6)   (7)     (8)   (9)    (10)   (11)
Lesson 17: | 
Column. . 44) .60/)38| .12/ 14] .25/ 13 ]......] 3 
Paragraph 35 | 36 paans 26 [e+ | 13 | .61] 6] 2.49 
Lesson 18: | | | 
Column......| 27 | .28 | 36 | 1.56 | 34 | 2.02 | 18 | 1.19| 4 
Paragraph. ...| 25 |.  ) ae | 31 | | 17 | | 7] .89 
Lesson 19: | | | 
Column. . 16 .| 14 | | Oi.. oe 
Paragraph....) 12} .15 | 21 | .73 {21 | .16 | 21 |......) 11} 1.3 
Lesson 20: ms ee | 
Column......| 50} .95 | 25 | 1.39 | 30 | 1.28/10 | 1.15] 1 
Paragraph... .| 34 |......| 39 | pt ae > ee BY 
Lesson 21: | | | 
Column. | 33 .82 | 38 32 | 1.12 | 13 | 3.70 | 4 a4 
Paragraph 16 | 125 | .87 | 44 - oar 13 
Lesson 22: | 
Column. . | 49 | 1.66 33 | = 15 |1.09)}15 | .42) 5 | 6.55 
Paragraph | 21 ...| 34 .O1 | 39 -oe-} 16 | | § 
Lesson 23: | 
Column 135] .62| 39 | .98| 27 | 1.91 | 11 | 1.26} 2 {13.17 
Paragraph ....| 30 3 eee 7 eee sg 7 
Lesson 24: | 
Column 56 | .29| 30) .44/17 / 1.48 | 9 | 6.02) 1 /13.50 
Paragraph | 26 34 oy Fs | 5] | 8 


Group also showed a larger percent of pupils who procured perfect
scores on the final tests in all the lessons. A summary of these com- 
parisons favors the Column Group. 

Measurement of ability to spell when a period of time has 
elapsed.—Columns 8 and 9 of Table VII present the comparison 
of the ability of the two groups to spell words previously studied 
nine weeks after the formal teaching process had been completed.
The data summarized in these columns clearly indicate that in the
ability to spell words previously studied the Column Group is
decidedly superior to the Paragraph Group.

Measurement of ability to spell words previously studied in new 
paragraph form.—The last two columns of Table VII provide a 
comparison of the abilities of the two groups to spell words previ- 
ously studied in new paragraph form. Examination will reveal 
that in only one lesson is the difference between the mean scores 
of the two groups significant. In this case the Column Group shows 
superiority. In all other cases but one, the Column Group has a 
higher mean score than the Paragraph Group, though the difference
between the two means is not significant. When the scores for 
all eight lessons are pooled, the difference between the two means 
is significantly in favor of the Column Group. Obviously, those 
pupils to whom words were presented for study purposes in col- 
umn form and who were tested by tests in column form show 
greater efficiency in spelling these words in new paragraph form 
than do pupils to whom the words were presented in paragraph 
form and who were tested by paragraph form. 

The findings of the column-paragraph experiment may now be 
summarized: First, pupils who studied and were tested by the col- 
umn form seemed to acquire a greater amount of spelling ability 
during the learning process than did the pupils who studied by 
paragraph form. Second, the pupils of the Column Group pro- 
cured greater ability to spell words previously studied nine weeks 
after the words had been taught. Third, in the ability to spell 
words in new paragraph form the pupils of the Column Group 
were superior. 


SUMMARY OF RESULTS OBTAINED IN ENTIRE INVESTIGATION 


The total results of the investigation may be summarized in
terms of the three stated definitions of efficiency as follows:

1. In the column-phrase experiment, pupils who used the column
form secured results superior to those obtained by pupils who 
used the phrase form in the amount of spelling ability acquired 
during the learning period, and in ability to spell words previ- 
ously studied nine weeks after the formal teaching process had 
been completed. The two groups showed approximately equal 
ability to spell words previously studied in new phrase form. 

2. In the column-sentence experiment, pupils who used the column
form secured results superior to those obtained by pupils who
used the sentence form in the amount of spelling ability acquired
during the learning period, and in the ability to spell words
previously studied nine weeks after the formal teaching process
had been completed. The two groups showed approximately
equal ability to spell words previously studied when presented
in new sentence form.

3. In the column-paragraph experiment, pupils who used the column
form secured results superior to those obtained by pupils
who used the paragraph form in the amount of spelling ability
acquired during the learning process, and in the ability to spell
words previously studied nine weeks after the formal teaching
process had been completed. They were also superior in the
ability to spell words previously studied when presented in new
paragraph form.

The conclusion to be drawn from the results of the three experiments
is that context exercises, as used in this investigation,
do not constitute a procedure in the teaching of spelling which is
as efficient as the common column form. When to the fact of their
inferiority is added the amount of time and energy necessary for
the construction and administration of these context forms in the
classroom, they become not only inefficient but also impracticable.


TYPES OF COUNTY EDUCATIONAL CONTROL
IN THE UNITED STATES 


JULIAN E. BUTTERWORTH
Cornell University 


Not long ago the writer became involved in a friendly argument
with a county superintendent as to the desirability of the county
unit in educational organization. As is not uncommon in such a
situation, a state of mind was gradually built up in which each 
appeared to be more concerned in proving his point than in getting 
at the truth. There seemed little hope of either convincing the other 
until the question was raised by one as to just what the other meant 
by “county unit.” It turned out that we were debating terms,
not ideas; he was advocating a particular type of organization 
under one terminology, while I was advocating practically the same 
organization under another terminology. The argument died out 
quickly for lack of fuel! 

Not all the differences regarding the place which the county 
should have in educational administration are traceable to an in- 
adequate definition of terms. Some of them are caused, however, 
by the lack of sufficient explanation, and this article purposes to 
give a clarification of concepts so that we may see more specifically 
the questions at issue. 

It is not surprising that the term “county unit” has so many
meanings. One of the important needs in American rural education
has been to overcome the ineffectiveness of the small school district.
Accordingly, whenever a state legislature gave the county a
large measure of responsibility in financing the schools, in selecting
teachers, in enforcing attendance, in developing a system of uniform
textbooks, and the like, there was a tendency to designate this
as a county unit. A decade ago the number of states having county
units was variously indicated as from fifteen to twenty. Writers 
sometimes distinguished between complete and partial or strong and 
weak systems according to the degree of responsibility given the 
county. One of the latest authoritative classifications, that by Mrs. 
Katherine M. Cook of the United States Bureau of Education, is 
much more rigid. She lists only ten states as having a county-unit 
organization and four as having combined county and district 
control. 





1 Rural School Supervision. Washington, D. C., Government Printing Office,
1922. pp. 6, 7. (Bureau of Education Bulletin, No. 10) 
There have been three main sources of confusion in the use of
the term, county unit. In the first place, the degree of control 
which the county should exercise over the schools within its bor- 
ders—if the organization is to be called a county unit—has not 
been exactly defined. The extent to which control varies under 
present conditions is illustrated in Table I where data are 
given on the three states of Maryland, Alabama, and Montana. 
In Maryland, the county is divided by the county board of edu- 
cation into districts which are controlled by trustees whom this 
board appoints. The trustees act as custodians of the buildings, 
exercise direct control over the pupils, and have some voice in the 
selection of teachers. The fact that they are responsible to the 
county board of education and not to the people of the districts
leaves authority on the most significant matters with the county. 
In Montana, the subdistrict of the county rural district—at present 
only one county operates under this law—has a board of trustees 
elected by the people. These subdistrict trustees select the teachers, 
control the pupils, administer the buildings, recommend the budget, 
and may raise additional funds. Montana, in leaving to the sub- 
district the selection of teachers, makes a very significant reserva- 
tion. Our judgment as to whether or not a county unit is desirable 
in a particular situation depends very much upon the type of 
authority which the county is given and the degree to which that 
authority may be exercised. 

A second point of confusion is found in determining what com- 
munities should be under the jurisdiction of the county in order 
to make a county unit. All villages and cities in Maryland, with 
the exception of Baltimore, are included in the county-unit organi- 
zation. A different type of county unit exists in Alabama, where a 
city of one thousand or more may be excluded from the county-unit 
scheme; and, by such arrangement, large local autonomy results. 

Some confusion, in the third place, has resulted from a failure 
to distinguish between the degree of control which is necessary 
to make a county unit and the organization which is desirable to 
bring about the most effective functioning of the unit. When the 
county unit has been proposed, it has often been defended on the 
basis of its providing a county board with authority to select a 
competent superintendent as its executive officer and an adequate 
force of clerical and professional assistants. Yet, in the so-called
“county-unit state” of Florida, the county superintendent is elected
by the people, while in the non-county-unit state of Ohio he is 
appointed by a county board. It is evident that there are two
factors in making an effective county unit—the degree of control
exercised and the form of organization needed to make that control 
function wisely. Of these factors, the degree of control to be given 
the county should first be determined, then the organization set up. 

The facts given in Table I show that the degree of county con- 
trol varies even among those states generally considered as having 
county units. Before a true evaluation of the influence of county 
control in a particular state can be made, it is necessary to have 
the detailed facts regarding the legal authority of the state. A 
classification which reveals significant differences in the state was 
made on the basis of the degree of control granted to the county and 
the nature and the size of the communities within its borders to 
which this control extends. Under such a plan we are separating 
entirely the problem of the degree of control from the problem 
of devising an effective organization. The various groups are desig- 
nated as types in the following outline: 


Type 1. County unit of the strong type:
    1. County controls all policies.
    2. All communities are included except cities.
    3. There are no lesser boards in the county district except those
       having advisory, clerical, or custodial functions only.
    4. States included in this group: Louisiana, Maryland, Utah.2

Type 2. County unit of the strong type:
    1. Control—similar to Type 1.
    2. The communities included are usually only common school
       districts and small villages.
    3. Lesser boards—similar to Type 1.
    4. States included: Alabama,3 Florida (see later discussion),
       Georgia,4 Kentucky, New Mexico, North Carolina, Tennessee,5
       Virginia (see later discussion).


2 In Utah, the local school district is the county in 24 cases, while in 4
cases the county is divided into 2 districts, and in one case into 3 districts.
There are 5 districts in cities having a population of more than five thousand.

3 Towns of six thousand population in one of the counties may be independent
of the county organization, while in the other counties the limit is two
thousand. One county has a county and city organization. In 1924-25, the
state report showed 28 towns having populations between one and two thousand
which had an independent organization.

4 Four counties have an organization of Type 1. In the other counties all
places of two thousand and more are independent and smaller places may be
if granted a special charter by the legislature.

5 Any incorporated town may be independent of the county board. There
seems to be no minimum population requirement for incorporation. According
to the State Department of Education 29 percent of the 95 counties do not have
Type 3. Semi-county unit or county unit of the “weak” type:
    1. Considerable authority is allowed the county, but at least, two
       of the following: selection of teachers; financing the schools;
       and control of such policies as consolidation, curriculum, etc.
    2. Communities included—usually as in Type 2.
    3. There are lesser boards within the county district.
    4. States included are Montana (optional),6 Nebraska (optional),7
       Oregon (optional),8 South Carolina.

Type 4. Non-county unit:
    1. Varying degrees of control exist among the states but usually
       the county plays only a minor part.
    2. Communities included—usually as in Type 2.
    3. There are lesser boards within the county district.
    4. States included are Ohio, Mississippi, and other states with
       county as a school unit.


Type 1 of our proposed grouping includes those states in which 
the county has control of all significant policies in all communities 
of the county except the large cities. All the cities in Louisiana are 
included, although Orleans Parish, in which is located New Orleans, 
has some special legislation. All Maryland cities except Baltimore 
are a part of the county organization, while in Utah only places of 
more than five thousand are independent. In county-control states 
of the first type there is no lesser board or, if there is one, it is 
usually appointed by the county board and its functions are merely 
advisory. It should be noticed that only in the state of Utah is 
every element of control assumed by the county, and even there a 
county is sometimes divided. 

In Type 2, the county exercises all significant functions except
in the villages, which are given complete or large independence.
An Alabama town of one thousand or more may vote to become 
independent of the county. A graded common-school district to be
independent, in Kentucky, must have at least 75 census pupils and 
maintain schools of a specified standard. A town of five hundred 
in Virginia may perform certain functions subject to the county 
board, and, in Florida, a district of any size may levy a special tax, 
thus securing some degree of independence. 








a city or incorporated town. In 1922 there were 322 city schools in which were 
employed about one-third of all the teachers and pupils in the state. 

6 Only one county is organized under this law. The other counties should
be classified under Type 4.

7 Although the law has been on the statute books for a number of years
no county has yet accepted its provisions. Actually the state should be classified
under Type 4.

8 Three counties are organized under this law. Other counties belong to Type 4.
Types 1 and 2 cover the states ordinarily included in the county
units of the strong type. In Type 3, the county has considerable
authority but divides responsibility with constituent districts and
so is often called the “semicounty unit” or the “weak county unit.”
Under Type 4, the county has some authority, but the chief authority 
rests with the constituent districts. In some states in this group, 
clerical duties and the supervision of the small schools constitute 
practically the only responsibilities of the county. The outline just 
given summarizes the important facts regarding this classification. 
If one reads the footnotes attached to the items of this outline, one 
will realize how varied are the types of organization which prevail 
in some of the states. 

Classification in almost any field results in some overlapping. 
This is certainly the situation here. Florida might be classed as 
Type 1, since even the special-tax districts are under the general 
supervision and direction of the county superintendent and the 
county board. However, because there is a possibility for this type 
of district to exercise a considerable degree of control, especially 
in the nomination of teachers and in making recommendations re- 
garding buildings, budget, and the like, it is placed in Type 2. The 
situation in Virginia is somewhat similar regarding towns of more 
than five hundred population. South Carolina presents a peculiar 
situation because, while legally the county board has supervision 
over all districts, the local trustees actually do almost as they please. 
There are about thirty special school districts created by the legis- 
lature over which the counties have no control. 

A careful analysis of these four types of control reveals some
significant differences. Types 1 and 2 constitute cases where the 
county is considered as the local unit. This is because the entire 
county, except where some of the larger villages and cities are 
exempted, exercises those forms of control ordinarily expected of 
a local unit. In Type 3 and Type 4, however, the county is divided 
into smaller legal units which exercise many of these major func- 
tions. The county still has, in these two forms, some authority and 
responsibility, but because it functions between a local unit and
the state, it is called an intermediate unit. 

One very significant difference between county control of Type 1
and that of Type 2 should be noted. In the former, there is no
lesser unit than the county, except in the case of the larger cities,
with a significant degree of control while, in the latter, the larger
villages (some are, in fact, quite small) are partly or largely
independent. In the latter situation, there is a separation of a village
and its contributing rural territory. Anyone familiar with rural
life realizes the important and intimate relations existing between a 
village and the adjacent rural territory. The farmer goes to the 
village for his groceries, hardware, dry goods, and other supplies. 
He does his banking there. He often attends church, grange, or 
fraternal organizations in this community center. He takes the 
train and gets his freight and express there. Often he receives 
credit from the village merchant that is nowhere else available to
him. On the other hand, the merchant, banker, doctor, and minister 
find that the farmer is contributing a large part of the patronage. 

The extent to which village and farm people have inter-relationships
may be shown in a more objective manner through data collected by
J. H. Kolb in Dane County, Wisconsin.⁹ The 129 service agencies
devoted to merchandising received from the farmers 75.6 percent of
their business as measured in dollars. More than 64 percent of the
customers, in terms of families, were from the farm. At the bank,
49.7 percent of the savings and 70.4 percent of the certificates of
deposit were held by farmers. The country furnished 52.1 percent of
the high-school pupils and board members, 48 percent of the church
members, and 40 percent of the officers and members of social and
fraternal organizations.

When such intimate relations exist between town and open
country in social and economic activities, it seems unwise to set up
barriers to cooperation for educational purposes. Yet this is exactly
what happens when the smaller villages are allowed to become
independent school units without regard to the outlying territory.
County control of Type 2 tends, therefore, to bring together for
school purposes not the village and its contributing areas but that
territory comprising the more sparsely populated sections of the
county.

These sections have little in common aside from the fact that
educational, social, and economic conditions are somewhat similar.
The people from areas of this type in one part of the county often
have few contacts with those on the opposite side of the county,
and thus little opportunity exists for the development of a genuine
group spirit. Their natural contacts are with their local community
centers, usually in or about some village. As the village develops
into a small city the surrounding rural districts have less
influence in the affairs of the municipality, thus tending to set off
their own particular interests from those of the city. Because of
this, there is reason in the view that a municipality dominated by
urban ideals may well be separated from the rural environs for
purposes of educational control. In the smaller municipalities, the same
reason for separation does not exist, at least not in the same degree.


⁹ Kolb, J. H. Service Relations of Town and Country. Madison, Wisconsin,
Wisconsin Agriculture Experiment Station, 1923. pp. 22, 23, 30, 33.
(University of Wisconsin Research Bulletin, No. 58)

In most of the states of Type 4, it would appear that the con- 
stituent districts are attempting to perform some duties which they 
are not able to do well, and that educational effectiveness would be 
promoted if they were transferred to the county. This by no means 
warrants the conclusion that all control should be so transferred. 

An attempted evaluation of these four types of county control 
is not within the scope of this article. The writer’s judgment is 
that the type to be preferred will vary among the states according 
to conditions and that even within a single state it may be desirable 
to have more than one of these types. The important consideration 
is not to seek the establishment of one type or another, but rather 
to set forth what the objectives of a good local school unit are and 
then to provide that type of unit which most nearly approaches the 
accomplishment of desired results.¹⁰ If we clarify our terminology
regarding county control, we shall have made a real step toward
better thinking on the problem.


¹⁰ In Educational Administration and Supervision, 11:145-56, March, 1925,
the writer of this article has set down his analysis of these objectives.
THE PREDICTIVE VALUE OF CERTAIN MEASURES
OF ABILITY IN COLLEGE FRESHMEN


GLEN U. CLEETON
Carnegie Institute of Technology 


The experiments which are described in this report deal with
problems related to the predictive value of certain measures of
developmental capacity and measures of attainment. College Freshmen
were used as subjects. Intelligence-examination scores, scores
on the High-School Content Examination, and scholarship scores 
were used as indexes of ability and attainment. 


PREDICTIVE VALUE OF INTELLIGENCE-EXAMINATION SCORES 


The Thorndike Intelligence Examination was administered to 
several groups of college Freshmen enrolled in pre-engineering 
courses in September, 1923; September, 1924; and September, 
1925. The first question usually asked by administrative officers in 
connection with experiments, such as we have under consideration, 
is, ‘‘How well does an intelligence examination predict scholastic 
ability?’’ In order to answer such a question, some measure of 
school achievement must always be obtained. 

For our purposes, “quality points of scholarship” were determined
and used as scholarship scores. Quality points of scholarship
are computed by multiplying the number of units earned by a
designated weight for grades earned. A class meeting three hours
per week earns 9 units; two hours a week, 6 units; one hour per
week, 3 units, etc. Teachers’ marks are rated as follows: A earns
6 points; B, 5 points; C, 4 points; D, 3 points; E, 2 points; F, 1
point; and R, 0. A more accurate equating of teachers’ marks
might be obtained by careful statistical treatment, but the added
labor seemed to more than outweigh the gain in increased
comparability of measures which might thus be obtained. The sum of
the quality points of scholarship earned in all courses was used
throughout the experiments reported as the quantitative criterion
of scholastic achievement. The total number of these points was
used because it was the most objective measure readily available. It
is the established method of indicating scholastic success in the
institution in which the experiments were conducted.
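
The computation just described can be sketched in a few lines. This is only an illustration under stated assumptions: the rule of 3 units per weekly class hour is inferred from the three cases given in the text, and the function names and the sample schedule are hypothetical, not taken from the institution's records.

```python
# Grade weights as stated in the text: A earns 6 points down to F at 1; R earns 0.
GRADE_POINTS = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2, "F": 1, "R": 0}

# The text gives 9 units for three class hours per week, 6 for two, 3 for one,
# so this sketch assumes 3 units per weekly class hour.
UNITS_PER_HOUR = 3


def quality_points(hours_per_week, grade):
    """Quality points for one course: units earned times the grade weight."""
    return hours_per_week * UNITS_PER_HOUR * GRADE_POINTS[grade]


def total_quality_points(courses):
    """Sum quality points over (hours_per_week, grade) pairs."""
    return sum(quality_points(hours, grade) for hours, grade in courses)


# Hypothetical schedule, not taken from the article.
example_schedule = [(3, "A"), (3, "C"), (2, "B"), (1, "D")]
example_total = total_quality_points(example_schedule)
print(example_total)
```

A three-hour course marked A thus contributes 54 points, and the scholarship score is the sum over all courses carried.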
The reliability coefficient for the Thorndike Examination as a
whole is about 0.85. The reliability of quality points of scholarship,
determined by correlating marks obtained during the first semester
with those of the second semester of the freshman year, usually
exceeds 0.80 in this institution. Further discussion of the reliability
and significance of teachers’ marks as criteria is offered in a later
section of this report.

Correlations obtained in our experiments between Thorndike
scores and the average of points on scholarship were usually around
0.50. Those given in Table I are representative for the freshman
and sophomore years, and for the two years combined. Higher
correlations than those given in this table have been reported by
other persons working with the Thorndike Examination.¹


TABLE I

CORRELATIONS BETWEEN SCHOLARSHIP
AND THE THORNDIKE SCORES

Number of       Correlation        Probable
Students        Coefficient (r)    Error
   (1)              (2)              (3)

Freshman Year
   239             .519             .032
   240             .472             .033
   2[?]0           .482             .034
   257             .521             .030

Sophomore Year
   165             .378             .045

    95             .442             .055
   165             .423             .043
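
The probable errors printed beside these coefficients can be checked roughly. The article does not state how they were computed; the sketch below assumes the conventional formula of the period, PE = 0.6745(1 - r²)/√N, and the function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Conventional probable error of a correlation coefficient (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (number of students, r, printed probable error) as read from Table I.
table_i_rows = [(239, 0.519, 0.032), (240, 0.472, 0.033), (165, 0.378, 0.045)]

for n, r, printed in table_i_rows:
    computed = probable_error_of_r(r, n)
    print(f"N={n}  r={r:.3f}  computed PE={computed:.4f}  printed PE={printed:.3f}")
```

Under that assumption the computed values agree with the printed ones to within about one unit in the third decimal place.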

It is customary in the institution where the experiments² were
conducted to group students into “quality-point classes.” The
scheme of classification, based upon class marks, is:


Group          Quality Points

  A .......... 270 or over
  B .......... 245-269
  C .......... 175-244
  D .......... 150-174
  E .......... 125-149
  F .......... 100-124
  R ..........   0-99


TABLE II

COMPARATIVE DISTRIBUTION OF ONE THOUSAND CASES ON BASIS OF THORNDIKE
CLASSES AND QUALITY-POINT CLASSES

Classes Determined by                Quality-Point Class
Thorndike Scores          R     F     E     D     C     B     A
        (1)              (2)   (3)   (4)   (5)   (6)   (7)   (8)

         A                 4     3     6    11    42    29    23
         B                 8     6     9    19    96    25    12
         C                25    31    50    69   151    16     2
         D                32    40    41    52    42     4     2
         E                24    19    20    18    17     1
         F                11     7     5     6     3     1
         R                 9     3     2     3     1


¹ Wood, Ben D. Measurement in Higher Education. Yonkers, New York,
World Book Company, 1923. p. 77ff.

² The experiments here reported are part of a series now in progress. The
theoretical problems related to the experiments are discussed in an article by
the writer, “Meeting the Need for Improved Measures to Be Used in the
College Guidance Program,” Educational Administration and Supervision,
11:489-94, October, 1925.


For administrative purposes, an attempt was made to parallel
the scholarship quality-point classes in terms of Thorndike scores
when the examination was first introduced. The arbitrary
classification used was:


Group          Thorndike Scores

  A .......... 96 or over
  B .......... 86-95
  C .......... 71-85
  D .......... 61-70
  E .......... 51-60
  F .......... 41-50
  R .......... 40 or below
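
The arbitrary classification above amounts to a simple lookup. The following sketch assumes whole-number Thorndike scores and reads the lowest class as 40 or below; the names are hypothetical.

```python
# Lower bound of each Thorndike class, highest class first, as listed in the text.
THORNDIKE_CLASS_FLOORS = [
    (96, "A"), (86, "B"), (71, "C"), (61, "D"), (51, "E"), (41, "F"),
]


def thorndike_class(score):
    """Return the letter class for a Thorndike score; 40 or below is class R."""
    for floor, letter in THORNDIKE_CLASS_FLOORS:
        if score >= floor:
            return letter
    return "R"


print([thorndike_class(s) for s in (101, 90, 71, 60, 45, 40)])
```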


The comparative distribution of one thousand persons in each
of these two classes is shown in Table II. A study of this table
will show that prediction within very restricted limits is not highly
successful when based upon the present relationship between
Thorndike scores and scholarship points. For instance, the fact that a
person rates A on the Thorndike Examination does not signify in
more than about one case in five that such a person will likewise
finish the freshman year with a rating of A in scholarship. If the
limits of prediction are widened, the significance of the Thorndike
score becomes greater. A rating of A on the Thorndike Examination,
for instance, signifies that persons so rated will finish the
freshman year with a scholarship rating of average, or above, in
about four cases in five.
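
The two statements about students rated A can be checked against the first row of Table II. The sketch below uses the counts as read from that row and assumes that class C is what the text calls “average”; the variable names are hypothetical.

```python
from fractions import Fraction

# Quality-point classes in Table II's column order, lowest to highest.
CLASSES = ["R", "F", "E", "D", "C", "B", "A"]

# Counts for students rated A on the Thorndike Examination, as read from Table II.
thorndike_a_row = dict(zip(CLASSES, [4, 3, 6, 11, 42, 29, 23]))

total = sum(thorndike_a_row.values())

# Share finishing the year with an A in scholarship.
p_a = Fraction(thorndike_a_row["A"], total)

# Share finishing average or above, taking C as "average" (an assumption).
p_avg_or_above = Fraction(sum(thorndike_a_row[c] for c in ("C", "B", "A")), total)

print(total, round(float(p_a), 3), round(float(p_avg_or_above), 3))
```

On these counts the first share is 23 in 118 and the second 94 in 118, which agree with “about one case in five” and “about four cases in five.”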

If the record shown in Table II may be taken as likely to repeat
itself, certain probabilities that a student in a given
Thorndike-score class will finish the freshman year with a scholarship
record of average or above may be set up. Just what these probabilities
are may be listed in terms of ratios. The probabilities that students
in each Thorndike-score class will earn scholarship ratings
of average or above are shown by the following ratios:


Thorndike-Score Class          Probability Ratio


Thorndike suggests that “persons scoring under 60, who are
seventeen years old or older, are as a rule unsuitable material for
college education.” The results of an analysis of the scholarship
records, at the end of the year, of the 150 cases found among the
one thousand students who earned 60 or less on the Thorndike
Examination may be summarized as follows:

Rated above average
Rated average
Rated below average
Rated unsatisfactory


Such figures as these indicate strongly that no serious harm
would be done if regularly trained high-school graduates were
subjected to the 60-score limitation. It must be kept in mind, however,
that a certain number who earn 60, or below, will attain an average
standing in scholarship.


³ It should be noted that the five persons in Thorndike classes “F” and
“R” who earn average marks or above are students who entered the institution
with a language handicap.

⁴ In the first two groups there were 9 students who knew a foreign language
as a mother tongue, reducing the net number to 14 in 150 cases.
In administering the Thorndike Examination, the question
suggests itself as to whether the examination as a whole has more value
as a predictive measure than its separate parts. The answer to this
question, in so far as our results will indicate such an answer, is
shown in Table III. The correlations given are considered typical.

Later in this report, it will be shown that a correlation
approximating that secured between the total Thorndike Examination and
scholarship, covering two hours and forty minutes of net working
time, has been secured by selecting certain tests from Part I using
only seventeen and one-half minutes of net working time.


TABLE III

THORNDIKE SCORES CORRELATED WITH SCHOLARSHIP

Thorndike                Correlation    Probable    Time Used
Examination                  (r)         Error      (Minutes)

Part I (One form)           .398         .036          30
Part I (Two forms)          .389         .036          60
Part II                     .391         .036          60
Part III                    .416         .036          40
Part I plus Part II         .438         .035          90


TABLE IV

RELIABILITY OF QUALITY-POINT
SCHOLARSHIP RATINGS FOR
FRESHMAN STUDENTS

Number of       Correlation        Probable
Students        Coefficient (r)    Error

  [?]              .823             .014
  281              .824             .013
  147              [?]              .017
  156              [?]              [?]


SCHOLARSHIP RECORDS AS CRITERIA OF SUCCESS 


The problems of a marking system and the use of more exact
measures of estimate have received careful attention in the
institution in which the experiments herein reported were conducted. As
a result there is a fair degree of consistency in marks. The
reliability coefficients shown in Table IV are representative of those
obtained by correlating first-semester marks with second-semester
marks for freshman students.

First-year scholarship quality points do not predict second-year
scholarship scores with the same degree of accuracy as obtains
between first- and second-semester marks in the freshman year. The
coefficients of correlation obtained between first-year scholarship
scores and those of the second year are around .70; .697 in one
instance with 165 cases and .727 in another instance with 95 cases.


PREDICTIVE VALUE OF CONTENT-EXAMINATION SCORES 


It has long been held that knowledge of high-school
subject-matter is essential to college success. It is certainly logical to
assume that mental content of a certain type is prerequisite to the
pursuit of college courses, and, in a way, indicative of ability to
profit by college training. In seeking to determine the predictive
value of the informational type of questions covering high-school
subject-matter, the Iowa High-School Content Examination was
administered to a group of college Freshmen in September, 1924.
This examination requires eighty minutes and consists of four
sections, one each of English, mathematics, science, and history
including social-science questions. The reliability of this examination
is reported as .949.⁵


TABLE V

COMPARATIVE DISTRIBUTION OF ONE THOUSAND STUDENTS IN SCHOLARSHIP
QUALITY-POINT CLASSES BASED ON THE GRADES OF THE
FIRST AND SECOND SEMESTER

First-Semester              Second-Semester Scholarship Classes
Scholarship Classes       R     F     E     D     C     B     A
        (1)              (2)   (3)   (4)   (5)   (6)   (7)   (8)

         A                                       10   [?]    41
         B                             3     4  [?]   [?]     8
         C                 7     9    21    63   231    27     1
         D                 9    14    31    69    43     1
         E                19    23    58    24    17    11
         F                25    34    19    11     7
         R                66    19   [?]     7     5
⁵ For data and sample of the examination, see Ruch, G. M. Improvement
of the Written Examination. Chicago, Scott, Foresman and Company, 1924.
p. 153.
The correlations obtained between freshman scholarship scores
and scores on the content examination are shown in Table VI.
This table speaks for itself so clearly that comment on the
coefficients cited will be withheld until a later section of the report.


INCREASING ACCURACY OF PREDICTION BY COMBINING SCORES 


Scores on Parts I and II of the Thorndike Intelligence
Examination were available for persons who took the Iowa Content
Examination in 1924. Considered separately, neither the
intelligence-examination scores nor content-examination scores shows
correlations with scholarship scores noticeably higher than was obtained by
using the intelligence examination alone as a predictive instrument.
However, by combining scores obtained on the intelligence examination
with those on the content examination, it was possible to
increase the correlation with scholarship to about 0.60.


TABLE VI

CORRELATIONS BETWEEN CONTENT EXAMINATION SCORES AND SCHOLARSHIP
POINT SCORES

                                   Correlation    Probable              Time
Iowa Examination                   Coefficients    Error     Number   (Minutes)

Entire examination                    .489         .03[?]     283        80
Entire examination                    .495         .028       315        80
English section                       .316         [?]        283        20
English section                       .344         .03[?]     315        20
Mathematics section                   .424         .03[?]     283        20
Mathematics section                   .469         [?]        315        20
Science section                       .442         .03[?]     283        20
Science section                       .436         [?]        315        20
History including social science      .330         .036       283        20
History including social science      .341         .03[?]     315        20
Mathematics plus science              [?]          .026       283        40
Mathematics plus science              .535         [?]        [?]        [?]
English plus history                  .346         [?]        [?]        [?]
English plus history                  .369         [?]        [?]        [?]

When the records of 283 students are used, a rough approximation
of weightings for the combined scores yields a correlation of
.48, PE ± .027, between scholarship quality-point scores and the
sum of the weighted scores for the Thorndike Examination and
content examination.⁶


⁶ The method used in determining the weighted scores is described at the
end of this article.
In order to determine the best possible combination of scores to
estimate scholarship, multiple-correlation technique was
introduced. Kelley’s formulas were used.⁷ The intercorrelations for
the records of 315 students were:⁸


Intelligence examination (Part I plus Part II) with content
  examination ............................................... 0.782 ± .015
Intelligence examination with scholarship points ............ 0.434 ± .030
Content examination with scholarship points ................. 0.511 ± .028
Intelligence plus content examination with scholarship
  points .................................................... 0.605 ± .024


Possibilities for further increase in accuracy of prediction.—The 
correlation coefficients of .60 to .65, obtained by careful weighting 
of scores, seem to indicate the ultimate efficiency of prediction which 
we may hope to attain between present measuring devices and other 
indexes of success. If we are to increase the predictive accuracy of 
measures, it seems that future prospecting must follow other lines. 
Three possibilities which readily suggest themselves are: (1) to 
secure other criteria of success to supplement scholarship records, 
(2) to supplement measuring devices of capacity now in use with 
radically different measures, and (3) to improve the predictive value 
of existing measuring devices of capacity or to prepare measures 
of the same general nature as those now in use but of a higher degree
of differential sensitivity. 

The correlation of each separate part of the Thorndike
Examination with scholarship scores is shown in Table III. On further
analysis, it was found that certain tests in the Thorndike
Examination contributed little or nothing to the predictive value of the whole
examination in our experiments.

Quartile tabulations of scores on each of the tests in Part I of
the Thorndike Examination against scholarship scores were made.
These quartile tabulations showed that the following tests were not
functioning as predictive measures to any degree significantly better
than chance:


                                        Number       Time
                                       of Items    (Minutes)

Test  1 (Following Directions) .......     5          1½
Test  7 ([illegible]) ................     8          2
Test 10 (Number Checking) ............     9          2
Test 11 ([illegible]) ................    11          2
Test 12 ([illegible]) ................     9          2
Test 13 (Identification) .............    20          4


⁷ See Kelley, T. L. Statistical Method. New York, Macmillan Company,
1923. pp. 279-310.

⁸ With another group of 156 students taking a somewhat different
freshman course, a correlation of 0.644 between combined weighted scores and
scholarship points was obtained.
Since there are only a few items in each of the tests named, it
was not possible to determine the reliability of each single test. A
score was computed for all tests in the list by taking the sum of
the scores on all tests as a total. The reliability coefficient for these
tests combined, computed by correlating the scores of 240 students
on the two similar forms, was found to be .509 ± .032. For
predictive purposes in our experiments, the tests just named represent
thirteen and a half minutes of net working time wasted for each
form of Part I, making a total of twenty-seven minutes in the whole
examination. The correlation of the combined scores on these tests
with scholarship was .089 ± .038.

Quartile tabulations showed that the following tests in Part I
of the Thorndike Examination had merit for predicting scholarship:

                                        Number       Time
                                       of Items    (Minutes)

Test 2 (Disarranged Sentences) .......    10          2½
Test 3 (Arithmetical Operations) .....     8          [?]
Test 4 (Arithmetical Reasoning) ......    10          [?]
Test 5 (Information) .................    10          [?]
Test 6 (Opposites) ...................    20          [?]
Test 8 (Number Completion) ...........    10          [?]
Test 9 ([illegible]) .................    20          [?]

The reliability coefficient for scores secured by summing the
right responses on these tests was .759 ± .018. The correlation
between scores from these selected tests and freshman scholarship
scores was found to be .477 ± .034. This correlation between
selected tests and scholarship is more significant if considered from
the point of view of the time required. The tests used represent a
net working time of seventeen and one-half minutes. The correlation
is higher than for Part I as a whole, which is usually around
0.40, and which represents a net working time of thirty minutes. It
is higher than for two forms of Part I representing one hour of net
working time, and it is also higher than for summed scores of Part I
and Part II on which the time is one hour and thirty minutes. This
correlation compares favorably with those obtained for the whole
examination which requires two hours and forty minutes net working
time. Analyses of other parts of the Thorndike Examination
reveal similar facts relative to reliability of different tests and their
predictive merit.

Correlations between scholarship and separate tests in the Iowa
Content Examination are given in Table VI. The most outstanding
fact shown by this table is that the score obtained by summing the
mathematics and science sections gives a higher coefficient when
correlated with scholarship than does the total examination score.
Analysis of the Thorndike Examination suggests that there is
serious need of attention to the question of reliability. Careful
examination of the responses to the tests in the group having a
reliability of 0.501 makes it possible to offer the following
conclusions as to the reasons for unreliability: (1) Tests are too brief.
Many persons could finish more items in the time allowed than are
supplied. (2) All items are often quite easy. In twenty-four hundred
responses to one test, for instance, there are less than 2 percent
of errors.⁹ (3) The items of the test are arranged apparently
without regard for relative degrees of difficulty. (4) The steps
from item to item are often broad. (5) The form of the test is at
times cumbersome and confusing.¹⁰


SUMMARY 


1. The reliability coefficients of the measures used in the study
herein reported are approximately: Thorndike Intelligence
Examination, .85; Iowa High-School Content Examination, .95; and
scholarship quality points, .80.

2. Measures of capacity used predict measures of success in a 
one-year pre-engineering course as follows: Thorndike Intelligence 
Examination, .50; and Iowa High School Content Examination, .50. 

3. The ultimate efficiency of prediction, likely to result from the 
use of multiple-correlation technique in weighting and combining 
Thorndike Scores and scores of the Iowa High-School Content Ex- 
amination, seems to lie somewhere between .60 and .65 using these 
examinations in their original form. 

4. A coefficient of .50 is not highly significant if close limits 
of prediction are demanded. In predicting the probabilities of a 
student attaining a scholarship rank either above or below average, 
that is, in using two categories of attainment for purposes of
prediction, such a coefficient has considerable practical significance.
Chances of attainment may be specifically expressed in terms of 
ratios. 

5. A lower critical score, considered in the light of probabilities
of a student attaining a satisfactory achievement standing, has
significance. However, a lower critical score does not warrant
predictions which must be arbitrarily inclusive or exclusive.


⁹ For data bearing on the problem of relation between the degree of
difficulty and predictive value of test items see an article by the author,
“Optimum Difficulty of Group Test Items,” Journal of Applied Psychology,
10:327-40, September, 1926.

¹⁰ Test 12 in Part I of the Thorndike Examination is an example of a test
which causes the subject confusion because the arrangement of items is new
to most persons taking the test.

6. Measures which predict probable success in first-year
pre-engineering courses are not equally significant, in the instance of
the present studies, in their prediction of success in later specialized
courses.

7. Part I, Part II, or Part III of the Thorndike Examination
does not predict success in freshman courses as accurately as does the
total examination. However, by selecting certain tests contained
in Part I, it is possible to secure a combination of tests which
predict success with greater accuracy than does Part I in its entirety,
or, for that matter, with greater accuracy than does the sum of two
forms of Part I, or the sum of Part I and Part II. This represents
superior prediction obtainable with a group of tests requiring
seventeen and a half minutes of net working time as compared with
other tests requiring thirty minutes, sixty minutes, and one hundred
minutes respectively for administration. The selected tests covering
seventeen and a half minutes of time used predict achievement
almost as efficiently as does the whole examination requiring three
hours working time. The test items in the group requiring
seventeen and a half minutes fall readily into three classes, that is, those
dealing with word relations, those dealing with number relations,
and information questions.

8. Revision of the Thorndike Examination with careful attention
being given to form and questions of calibration would quite
likely bring a profitable increase in reliability of the examination.

9. The sum of the mathematics and science scores of the Iowa
High-School Content Examination, requiring a net working time
of forty minutes, predicts success in a first-year pre-engineering
course with greater efficiency than does the total Thorndike
Examination score, or the total Iowa Content score; the latter two
representing three hours and eighty minutes net working time,
respectively.

10. The improvement of existing measuring devices, or provision
of devices of a similar nature, but superior in predictive efficiency,
may be best attained by establishing greater perfection in
form and objectivity of measures; by solving of problems of
calibration as related to reliability and predictive merit; and,
ultimately, by the determination of the most sensitive test items, both
of the general intelligence-test type and of the content type, for
inclusion in future tests to be used for predictive purposes.


368 JOURNAL OF EDUCATIONAL RESEARCH [Vol. 15, No. 5


ADDITIONAL DATA 
METHOD OF WEIGHTING 


The method of approximate weighting scores was developed as follows:" 


$M_1$ = mean of scholarship point scores, 

$M_2$ = mean of Thorndike scores, 

$M_3$ = mean of content scores, 

$X_1$ = gross scholarship score, 

$X_2$ = gross Thorndike score, 

$X_3$ = gross content score, and 

$N$ = number of independent variables. 

Then $\dfrac{M_1}{M_2} \div N = W_2$ 

and $\dfrac{M_1}{M_3} \div N = W_3$ 

and finally $X_1 = (X_2 \cdot W_2) + (X_3 \cdot W_3)$. 
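
The weighting rule can be sketched in a few lines of Python. This is only an illustration, assuming each weight is read as (mean scholarship score / mean predictor score) divided by the number of predictors; the function names and the sample means are hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def approximate_weights(mean_criterion: float,
                        predictor_means: Sequence[float]) -> List[float]:
    """One weight per predictor: (criterion mean / predictor mean) / N."""
    n = len(predictor_means)
    if n == 0:
        raise ValueError("at least one predictor mean is required")
    return [mean_criterion / m / n for m in predictor_means]


def predicted_score(gross_scores: Sequence[float],
                    weights: Sequence[float]) -> float:
    """Sum of gross predictor scores, each multiplied by its weight."""
    if len(gross_scores) != len(weights):
        raise ValueError("one weight is needed for each gross score")
    return sum(x * w for x, w in zip(gross_scores, weights))


# Hypothetical means: scholarship 20, Thorndike 80, content 100.
weights = approximate_weights(20.0, [80.0, 100.0])

# A student scoring exactly at both predictor means is predicted
# to earn the mean scholarship score.
estimate = predicted_score([80.0, 100.0], weights)
```

Under this reading each predictor contributes an equal share of the mean scholarship score, which is why the footnoted caution about correlations and sigmas matters for any extended use.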


FURTHER INCREASE IN THE ACCURACY OF PREDICTION 


If it is true that critical selection of tests to be used and that weighting 
of the various scores on the tests will serve to increase the predictive value of 
examinations or scales of tests, it seems reasonable to expect that careful selec- 
tion of the items within tests will increase the predictive value of the tests. 
While this is a reasonable assumption, it is one on which information is lacking 
in the literature on test making. 

In seeking information on the proposition just stated, certain experiments 
were undertaken for the purpose of determining the value of carefully selected 
test items as against unselected test items. The term, ‘‘selected items,’’ as 
here used means test items which have shown by past performance that they 
have individual value as predictive measures. 

To get an indication of the value of single items of the test, four groups 
of persons were chosen as representative of four categories of ability. These 
four categories of ability were assumed to be represented by the quartiles of 
distribution of scholarship scores and covering what might readily be named as 
the superior group, above-average group, below-average group, and inferior 
group. A large number of persons were selected as representative in each 
group, first, on the basis of their scholarship records. Those whose Thorndike 
score was more than one of the four categories away from the scholarship rank 
were rejected. Further eliminations were made of those whose Iowa Content 
Examination scores varied more than one of the four categories away from 
scholarship rank. By chance elimination, it was then possible to secure 60 
persons for each of the four control groups named. 
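
The selection of control groups described above might be sketched as follows. This is a rough illustration only: the quarter assignment by rank, the field names, and the sample records are assumptions, and the final chance reduction to 60 persons per group is left out.

```python
from typing import Dict, List, Sequence


def quartile_category(score: float, all_scores: Sequence[float]) -> int:
    """Return 0 (lowest quarter) through 3 (highest quarter) by rank position."""
    below = sum(1 for s in all_scores if s < score)
    return min(3, (4 * below) // len(all_scores))


def select_control_groups(
    students: Sequence[Dict[str, float]]
) -> Dict[int, List[Dict[str, float]]]:
    """Group students by scholarship quarter, rejecting inconsistent cases."""
    measures = ("scholarship", "thorndike", "content")
    columns = {m: [s[m] for s in students] for m in measures}
    groups: Dict[int, List[Dict[str, float]]] = {0: [], 1: [], 2: [], 3: []}
    for s in students:
        cats = {m: quartile_category(s[m], columns[m]) for m in measures}
        # Reject anyone more than one category away from scholarship rank.
        if abs(cats["thorndike"] - cats["scholarship"]) > 1:
            continue
        if abs(cats["content"] - cats["scholarship"]) > 1:
            continue
        groups[cats["scholarship"]].append(s)
    return groups


# Hypothetical records; student 8 has a high scholarship rank
# but the lowest Thorndike score, so that case is rejected.
students = [
    {"id": 1, "scholarship": 1, "thorndike": 10, "content": 10},
    {"id": 2, "scholarship": 2, "thorndike": 20, "content": 20},
    {"id": 3, "scholarship": 3, "thorndike": 30, "content": 30},
    {"id": 4, "scholarship": 4, "thorndike": 40, "content": 40},
    {"id": 5, "scholarship": 5, "thorndike": 50, "content": 50},
    {"id": 6, "scholarship": 6, "thorndike": 60, "content": 60},
    {"id": 7, "scholarship": 7, "thorndike": 70, "content": 70},
    {"id": 8, "scholarship": 8, "thorndike": 5, "content": 80},
]
groups = select_control_groups(students)
```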

After the control groups had been set up, the responses to each item in 
two forms of Part I, which had been administered to freshman students, were 
tabulated for each of the 60 persons in each of the four control groups. A 
similar tabulation was made of responses to all items in the Iowa High-School 
Content Examination. 

" Extended use of this formula necessitates that allowance be made for the corre- 
lation of the independent variables separately with the dependent variable and for the 
sigma of each variable. 

Where there was a consistent increase in the number of right responses 
from the inferior group through the four categories of ability to the superior 
group, the test item was considered to have predictive value. If a parallel 
between right responses and ability categories did not exist, the test item was 
rejected. In some instances all persons taking the tests did not reach a given 
item before time for finishing was exhausted. In such cases the probable num- 
ber of right responses was estimated in proportion to the number of right re- 
sponses versus wrong responses. Omissions were counted as wrong responses. 
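
The acceptance rule for a single item can be sketched as below. Treating "consistent increase" as a strict rise from each group to the next, and scaling unfinished items up to the full group size, are both assumptions made for illustration; the function names and counts are hypothetical.

```python
from typing import Sequence


def estimated_right(right: int, wrong: int, group_size: int) -> float:
    """Scale observed right responses to the whole group when
    some persons did not reach the item."""
    attempted = right + wrong
    if attempted == 0:
        raise ValueError("no responses were recorded for this item")
    return group_size * right / attempted


def has_predictive_value(right_by_group: Sequence[float]) -> bool:
    """True when right responses rise at every step from the
    inferior group to the superior group."""
    return all(lower < higher
               for lower, higher in zip(right_by_group, right_by_group[1:]))


# Hypothetical counts, ordered inferior -> superior.
accepted = has_predictive_value([26, 40, 50, 59])
rejected = has_predictive_value([30, 28, 35, 33])

# 20 right and 20 wrong among the 40 who reached the item, group of 60.
scaled = estimated_right(20, 20, 60)
```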

The following tabulations are illustrative of the method. The question or 
item which showed the following distribution of responses was considered 
significant: 

                        Right        Wrong 
                      Responses    Responses 
Superior group . . .     59            1 
Above average . . .                   12 
Below average . . .                   19 
Inferior . . . . . .     26           34 

This distribution is typical, but the division between groups was not always as 
distinct with some of the items accepted, while, in case of other items, it was 
greater. 

Questions or items which did not parallel the four categories were rejected. 
The following are sample distributions of rejected items; an inspection of the 
distributions will indicate the reason: 

Right Wrong 
Responses Responses 
Superior group... 
Above average... 3 
Below average .. . 9 
Inferior ... 9 


Superior... 
Above average... 
Below average . . . 
Inferior . . 




As the experiment progressed, it was found that two forms of Part I 
of the Thorndike Examination did not furnish a sufficient number of test items, 
so that it was necessary to get control groups from previous years which had 
used other forms of Part I. Selection of items was made in the same manner, 
but high-school standing was substituted for scores on the content examination, 
which were not available. Additional items were also secured from the Alpha 
Examination by using the same procedure. A series of experimental tests was 
constructed from the items thus selected. 

The experimental tests thus prepared were administered to a group 
of college Freshmen in September, 1925. As soon as scholarship records are 
made available, it will be possible to determine whether the tests containing 
selected items have a greater predictive value than tests which contain unselected 
items which were also used. 
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The tests listed in the following groups are being tried out at present. It 
is expected that others will be added as further check makes additional selected 
items available. 


Group A 

                                                      Number 
                                                     of Items 
[title illegible] . . . . . . . . . . . . . . . . .     75 
Mathematics Information . . . . . . . . . . . . . .     50 
[title illegible] . . . . . . . . . . . . . . . . .     50 
History—Social-Science Information . . . . . . . .     75* 

Group B 

                                                      Number 
                                                     of Items 
Confused Sentence (True-false) . . . . . . . . . .      30 
Confused Sentence (True-false, incomplete) . . . .      30 
Confused Sentence (Completion) . . . . . . . . . .      30 
[title illegible] . . . . . . . . . . . . . . . . .     50 
Antonym-Synonym (Multiple choice) . . . . . . . . .     50 
[title illegible] . . . . . . . . . . . . . . . . .     35 
Number Series Completion . . . . . . . . . . . . .      30 
[title illegible] . . . . . . . . . . . . . . . . .     25 

AUTHOR’S NOTE: Since the preparation of the original manuscript of 
this article, test items from various sources have been added to ‘‘Group B’’ 
listed above and the whole incorporated in a booklet, Tests of General Men- 
tal Ability: Group Examination, published by the Department of Education 
and Psychology, Carnegie Institute of Technology, Pittsburgh. The experi- 
mental tests which are now included in ‘‘Group B’’ are: 

                                                            Number 
                                                           of Items 
Word Relations Tests 
  Test 1. Confused Sentence (True-false-incomplete) . . .     45 
  Test 2. Analogies . . . . . . . . . . . . . . . . . . .     45 
  Test 3. Antonym-Synonym (Multiple choice) . . . . . . .     45 
  Test 4. Definitions (Matching) . . . . . . . . . . . .  [illegible] 
Number Relations Tests 
  Test 5. Number Series Completion . . . . . . . . . . .      30 
  Test 6. Arithmetic Operations . . . . . . . . . . . . .     40 
  Test 7. [title illegible] . . . . . . . . . . . . . . .     35 
Information Tests 
  Test 8. High-School Information (Multiple choice) . . .     50 
  Test 9. High-School Information (True-false) . . . . .      50 

* Tests in ‘‘Group A’’ comprise the Iowa High-School Content Examination, 
Form A-1. This test may be secured at a cost of 8 cents each from the Extension 
Division, State University of Iowa, Iowa City, Iowa. 
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A STUDY OF THE RELATIVE VALUE OF THREE 
METHODS OF TEACHING HIGH-SCHOOL 
CHEMISTRY 
H. B. NASH AND M. J. W. PHILLIPS 
West Allis Public Schools 

The fact that classroom technique is under fire need not be 
lamented; it is ever desirable that we evaluate our methods in 
the light of all possible evidence. Personal opinion has too long 
controlled our school procedure. With the development of ob- 
jective tests, however, we need no longer depend upon personal 
opinion. Experimental methods which seek an impersonal evalua- 
tion of procedure must supplement, if not supplant, the old sub- 
jective methods. Because the classroom itself provides the best 
laboratory for studying methods of teaching, the writers under- 
took an experimental evaluation of three methods of teaching high- 
school chemistry. The three methods used in this experiment may 
be called the pupil method, the combination method, and the in- 
structor method. Each of these methods will be described later. 

TABLE I 

MENTAL ABILITY OF THE THREE GROUPS 

[Table I gives, for I. Pupil Group, II. Combination Group, and III. 
Instructor Group, the mean and standard deviation of chronological age 
(years, months), mental age (years, months), and intelligence quotient. 
Most entries are not legible in this copy; the surviving fragments show 
ages near 17 years 5 months and quotients of 101 to 103.] 

For purposes of comparison it was first necessary, of course, to 
select three groups of students of comparatively equal ability. These 
groups were selected at the beginning of the second semester of 
1925, and changes were made in the composition of the groups un- 
til they were as nearly identical in mental capacity as it was pos- 
sible to make them. Figure 1 and Table I show the relative ability 
of these three groups as measured by the Miller Mental Ability 
Test, Form A. 
It will be seen from both the table and the graph that the groups 
were similar as to ability. The class taught by the pupil method 
had eighteen pupils while the other two classes had twenty-two 
pupils each. In each case, only fifteen pupils were considered in 
this study as these were the only ones who could be matched as 
to ability. The writers realize that the groups were small and that 
larger groups may yield somewhat different results. It is believed, 
however, that, since the groups were exposed to the methods in 
question for a five-month period, the results are suggestive and 
significant though, of course, in no sense final. 

[Figure 1. Similarity of groups based on mental ages. Separate curves 
for the Pupil, Instructor, and Combination groups; horizontal axis 
marked from 0 to 100.] 

The test used as a measure of progress was arranged by one 
of the writers of this article, who was also the instructor of the 
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three groups in question, and it consisted of the following three 
parts: Part I, true-and-false type, 60 items, 30 true and 30 false; 
Part II, completion type, 38 items made up of statements with 
missing items to be furnished by the pupil; Part III, multiple- 
choice type with three choices, 25 items. The test constituted ma- 
terial drawn from various sources, such as chemistry tests devised 
by prominent investigators¹ of high-school chemistry and materials 
from various textbooks, including the one which the classes were 
using.² The material had gone through two revisions with the 
purpose of making the test a fairly reliable measure of the range 
of information that should be acquired by a pupil during the sec- 
ond semester of study. 

The subject-matter ranged from sulphur—a non-metal—through 
metals, and involved a study of their properties, reactions toward 
other elements, tests to show the presence of the various metals, and 
practical and commercial applications of chemical science in the 
environment of the pupil. Since the three groups had already had 
a semester’s work in chemistry, they understood the nomen- 
clature, the writing of equations, the working of problems involv- 
ing molecular weights, gas volumes, temperature conversion, and 
the metric system. They also had had laboratory work in the 
preparation of the principal non-metals, common acids, and some 

THREE METHODS DESCRIBED 


Pupil method.—The members of the group taught by the so- 
called ‘‘pupil method’’ were allowed to study as they wished. Each 
individual studied as rapidly as he felt inclined to, worked experi- 
ments when he wished to, felt free to do as much as he could do, 
and was encouraged to ask questions whenever help or advice was 
needed. As soon as anyone had completed the work in the text 
and laboratory manual, topics of his own selection were chosen for 
which additional credit was given. The first pupil to complete the 
regular work finished near the middle of the semester; some com- 
pleted only between two-thirds and three-fourths of the regular 
work by the end of the semester. The materials in the textbook 


¹ Glenn, Earl R., and Welton, Louis E., New Types of High-School Chem- 
istry Tests for Instructional Purposes; Powers, S. R., Tests for General 
Chemistry; Rivett, B. J., Time Limit Tests in Chemistry; Bull, J. Carleton, 
Tests in First-Year Chemistry. 

² Brownlee, Raymond Bedell, and Others. Elementary Principles of Chem- 
istry. Boston, Allyn and Bacon, 1921. ix + 588 pp. 
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were outlined in units for the class, and the pupils were given 
little assistance by the instructor. Soon after the course began, 
each member of the group was working on a different topic. Some 
asked for help often, others never. The instructor did not offer 
help to the pupil who did not request it. He sought to depend 
upon the individual instruction of a pupil, set him on the right 
track, and let him go. 

No drill was used, nor was it possible to test a pupil’s progress 
from time to time, because it would have required giving fifteen 
different tests at one time. As is obvious, this method of instruc- 
tion is by no means easy, for each pupil is studying something dif- 
ferent and in a different way. 

Combination method.—The second method called, for purposes 
of experiment, the ‘‘combination method,’’ used demonstration, 
lecture, laboratory work, and recitation and was the method which 
had been used for years by one of the writers in his teaching of 
chemistry. It consisted of demonstrations in the lecture room, 
assisted by members of the class; individual laboratory work with 
simple experiments; and group laboratory work with the more 
complicated experiments. This was followed by class discussion 
and the points not clear were discussed, reviewed, and tested. The 
tests were followed by some drill on the points which the class had 
found difficult. This method, in some form or other, is most used 
in teaching chemistry today, and is the one with which most teach- 
ers are familiar. 

This group might be considered a control group. The material 
in the text was covered and a thorough review was held before the 
final test. No extra work was attempted nor projects suggested. 

Instructor method.—In the procedure used with the third group, 
called here the ‘‘instructor method,’’ units of work were assigned 
from the text and the pupils were never called upon to recite. No 
check was made to determine whether the lessons assigned were 
prepared or not. The pupils did not come in personal contact with 
any experimental materials; they simply came to the classroom 
where the instructor gave a complete demonstration or performed 
the experiment before the class. They were permitted to ask ques- 
tions of the instructor, but under no circumstances did the in- 
structor ask questions of the class members to determine whether 
or not the items of the demonstration were clear. Equations for 
reactions were written on the board, and problems were worked out 
for the pupil. The instructor did all the work for the class; the 
pupils took notes on what was done and occasionally questioned the 
instructor, which was the only activity of the pupil. The method 
consisted only of ‘‘teacher talk’’ with little or no ‘‘pupil talk.’’ 
The work of the semester was easily covered, and a number of spe- 
cial lectures and demonstrations were presented during the time 
remaining before the final test. This work, however, was not part 
of the regular work and was not covered in the test used. 


RESULTS 

The time allowed for each group concerned in this experiment 
was forty-five minutes, the regular time allotted for recitation or 
laboratory work. The same test was given to the three groups at 
the beginning of the semester’s work and again at the close. For 
brevity of reference the groups will be designated by the method 
of instruction used, that is, Pupil Group, Combination Group, and 
Instructor Group. 

TABLE II 

RESULTS OF THE INITIAL AND FINAL TESTINGS 

                          INITIAL TEST        FINAL TEST            GAIN 
                                Standard            Standard            Standard 
                         Mean   Deviation    Mean   Deviation    Mean   Deviation 
I. Pupil Group                                        16.5 
II. Combination Group             10.2       47.3     17.0 
III. Instructor Group                        52.2     14.0 

[Blank cells are entries that are not legible in this copy.] 

The results of the tests given to each group are shown in Table 
II. Figure 2 gives a graphical presentation of the gains shown 
in the table. It is obvious that, despite the differences existing 
in initial achievement, the Instructor Group has the highest final 
achievement. The adequate basis for comparison is, however, in 
terms of the average gain per group. The difference in gain be- 
tween the first and second groups is 6.7 in favor of the second; the 
difference between the first and third groups is 8.5 in favor of the 
third group; and the difference between the second and third 
groups is 1.8 in favor of the third group. Reduced to a percent- 
age basis, this means a gain of 8.2 and 1.5 for the Instructor Group 
over the Pupil Group and Combination Group, respectively. 
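
The comparison by average gain can be sketched as follows. The scores are hypothetical stand-ins for three pupils per group, not the article's data, and the function names are illustrative. The three pairwise differences are tied together: the first-to-third difference equals the sum of the other two, just as 6.7 + 1.8 = 8.5 in the figures reported above.

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def mean_gain(initial: Sequence[float], final: Sequence[float]) -> float:
    """Average of (final - initial) over matched pupils."""
    if len(initial) != len(final):
        raise ValueError("each pupil needs both an initial and a final score")
    return mean([f - i for i, f in zip(initial, final)])


def gain_differences(gains: Dict[str, float]) -> Dict[Tuple[str, str], float]:
    """Difference in mean gain for every pair of groups (later minus earlier)."""
    names = list(gains)
    return {
        (first, second): gains[second] - gains[first]
        for position, first in enumerate(names)
        for second in names[position + 1:]
    }


# Hypothetical (initial, final) scores for three pupils per group.
scores = {
    "Pupil": ([20, 30, 25], [40, 50, 60]),
    "Combination": ([10, 20, 30], [40, 52, 64]),
    "Instructor": ([10, 20, 30], [44, 54, 64]),
}
gains = {group: mean_gain(initial, final)
         for group, (initial, final) in scores.items()}
differences = gain_differences(gains)
```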
The differences indicated by the average achievement are shown 
to be real group differences. It is true that the upper 20 percent 
(about three people) of the Combination Group is superior to the 
upper 20 percent of the Instructor Group. For the other 80 per- 
cent, however, the latter group shows itself superior to the Combination 
Group and at all points of the distribution distinctly superior to 
the Pupil Group. 

[Figure 2. Gain made by groups on basis of information test used. 
Separate curves for the Pupil, Instructor, and Combination groups; 
horizontal axis marked from 0 to 100.] 

The results indicate that, within the limits of this experiment, 
in acquiring information, the instructor method of teaching high- 
school chemistry is superior to the pupil and combination methods. 
It is true that the superiority is not very marked in the case of the 
Combination Group, but in interpreting the significance of these 
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results it should be borne in mind that the Combination Group had 
a thorough review just prior to the final test and also that the In- 
structor Group actually covered a larger range of material. 

That there may have been information gained by the pupil 
method which the test did not attempt to measure is true, of course. 
The experiment was limited to the acquiring of certain definite 
fundamental information; outside of this nothing was measured. 
However, it would seem that if the facts are used by the pupil in 
actual situations under his own stimulus they should ‘‘stick’’ as 
well as, if not better than, when given to him in ‘‘predigested doses’’ 
as was done by the instructor method. The experiment does not 
indicate that this was so. 

A comparison of the standard deviations of the gains is very 
suggestive. That for the Pupil Group is 18.6 while for the In- 
structor Group it is 9.8—almost half. Even if this large differ- 
ence is compared with the initial standard deviation of these groups 
it is still significant and suggests the possibility that the instructor 
method has about the same value for all types of pupils while in 
the Pupil Group the method has a varying value depending upon 
the type of pupil with whom it is used. Where all the work is 
done by the instructor there is little opportunity left for the play 
of individual differences. In the Pupil Group there is abundant 
opportunity for this and hence the group becomes more widely 
different as to achievement. 

A little light is thrown on this problem by a study of the in- 
dividual cases. Figure 3 shows the relative changes in rank of the 
various pupils in the Instructor and Pupil Groups. These are 
quite suggestive. In the Instructor Group many pupils retained 
the same rank, while the changes, with one exception, were so small 
as to be of little or no significance. Compare with this the changes 
in the Pupil Group where only one pupil retains the same position 
and many show very large changes. 
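
The change in rank between the two testings can be computed as sketched below. The pupil labels and scores are hypothetical, and breaking ties by original order is an assumption made only to keep the example short.

```python
from typing import Dict


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 is the highest score; ties keep their original order."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_changes(initial: Dict[str, float],
                 final: Dict[str, float]) -> Dict[str, int]:
    """Positive values mean the pupil moved up between the two tests."""
    before, after = ranks(initial), ranks(final)
    return {name: before[name] - after[name] for name in initial}


# Hypothetical scores for three pupils.
initial_scores = {"A": 30, "B": 25, "C": 10}
final_scores = {"A": 35, "B": 60, "C": 90}
changes = rank_changes(initial_scores, final_scores)
```

A group in which most values are zero or near zero behaves like the Instructor Group described above; large positive and negative values correspond to the pattern reported for the Pupil Group.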

Two cases in the Pupil Group are worthy of note. The pupil 
who ranked second on the initial test ranked twelfth on the final 
test with an actual loss in information of eight points! This pupil 
is a girl of more than average ability, as measured by mental tests, 
but with little ability to concentrate or to apply herself. She is 
somewhat lazy and indifferent and requires constant supervision. 
The method seems totally unsuited for pupils of this type. The 
pupil who ranked fourteenth on the initial test ranked first on the 
final test with a gain of eighty points. This pupil is a boy who 
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always measures below average on mental tests and yet does good 
school work, particularly in natural science. He has unusual power 
of application which shows itself in all his school work. He was 
especially interested in his work in chemistry and was very anxious 
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Figure 3. Graphs showing changes in rank of the pupils in 
these groups based on initial and final testing 


to get to the project work. The pupil method seemed well adapted 
to this pupil. Besides making the best showing in the class on the 
final test there is no doubt but that he obtained much that was not 
measured by this test. This is the pupil who finished the work 
first—about the middle of the semester. 
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If these two cases represent a general situation and the indica- 
tions in this experiment are that they do, the type of classroom 
procedure should vary with the type of pupils in the group. Can 
this condition be met only by methods of individual instruction? 
Or, can it be met by placing the pupils in groups which are homo- 
geneous enough so that a certain method could be profitably used 
in one group? In such grouping it seems evident from this study 
that mental ability differences are not the only ones to be considered. 

The problems herein involved are of the utmost practical im- 
portance for all school people. The writers of this article are par- 
ticularly interested in the problem of the adaptation of classroom 
procedure to needs of the pupil and plan further investigations 
along these lines. 

A number of further problems arise out of the methods used 
in this investigation and the results obtained. A few which are 
worthy of consideration are given here. (1) What is the value of 
drill and review work in chemistry? The Combination Group had 
a large amount of this while the Instructor Group had none, yet 
the former group failed to register as large a gain as the latter. (2) 
What are the factors that operated to enable the Instructor Group 
to make a higher score on the final test? The Combination Group 
was exposed to a thorough review of the semester’s work just prior 
to the final test. This was not done in the case of the Instructor 
Group, the test being given some two weeks or more after the regu- 
lar work was completed. (3) It was noted that in the Pupil Group 
a few failed to complete the required work for the semester. From 
the standpoint of acquiring information, is this a weakness of this 
method of teaching or, since many did finish the work, could the 
apparent weakness be overcome by homogeneous grouping as sug- 
gested above? 
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A DILEMMA IN POPULAR EDUCATION 


Our educational system, especially in its upper reaches, devel- 
oped at a time when only a small and highly selected portion of the 
general population were likely to become students. As a conse- 
quence our educational system now trails with it many things of 
earlier days which are not suited to a less select student body. 
Terman told us some years ago that a full third of the pupils of 
our public schools could not negotiate the course of study in the 
high school; perhaps an additional third could negotiate it but 
might better be doing something else. Shall we change the high 
school or rail at the child? The favorite indoor sport of high-school 
teachers is to find fault with the child, yet the child is what this 
school business is all about. If the child fails to fit our present 
arrangements, then our arrangements are wrong as far as he is con- 
cerned. If a very considerable number of children who offer them- 
selves for our educational ministrations fail to grow in grace be- 
cause of our efforts, then our efforts are misguided. It is as futile 
to belabor those who seek our guidance as it is for a tailor to find 
fault with the human figure. Fitting the figure is the tailor’s job 
as the adaptation of education to the needs of children and young 
people is the teacher’s job. 

The tremendous demand for education gives rise to mass in- 
struction, and this fact brings with it a train of difficulties. There 
is some evidence that the teacher in the classroom is doing the job 
reasonably well, that the real fault is partly administrative and 
partly social. We simply do not know how to cope with mass in- 
struction, such as has developed in our larger cities, without incur- 
ring the danger of getting a very inferior quality of work or of 
ruthlessly slaughtering pupils. 

This is our dilemma. Shall we reduce requirements still fur- 
ther thereby enabling more high-school pupils and more college 
students to appear to qualify, or shall we maintain high standards 
of scholarship and flunk the students who do not meet them? 
Neither alternative is attractive. If we make education easy for 
everybody, if we adopt a still more attenuated system of minimum 
essentials, we shall wind up with an education which has lost its 
value. In bringing our offerings within the powers of those of 
less-than-average ability we shall have robbed these offerings of 
everything which made them vital, energizing, and challenging to 
more highly endowed individuals. 

If we maintain the standards of scholarship to which we are 
so much devoted, our high schools will remain class schools. In 
our judgment we must keep our scholarship standards for those 
to whom scholarship is a possibility. The training of leaders is 
still a function of our educational institution; yet it is equally 
clear that standards must be lowered if schooling is to be of real 
benefit to those of mediocre endowment. 

Studies within the elementary school have shown with apparent 
clearness that we must either revamp our entire system of instruc- 
tion giving it greater variety, flexibility, and adaptability or defi- 
nitely excuse multitudes of children from large areas of the cur- 
riculum. It may be that the Detroit X-Y-Z plan, or something like 
it, will have permanent value. Yet many a superintendent has 
found after years of experimenting with this plan that the most 
perfect initial classification may be wrecked after one or two 
semesters by its own administrative difficulties. The Morrison plan 
or the Winnetka plan, both extreme forms of organization, may 
bring us eventually to a solution. 

The dilemma which we are suggesting leads further than is 
immediately apparent. Teachers are being asked on the one hand 
to handle masses of mediocre pupils, to open their classroom doors, 
as well as their hearts, to an increasing number of subnormal chil- 
dren. On the other hand, these teachers are asked to face a new 
testing program, a program which sets up standards, which is 
rigid, uncompromising, and exacting, and which makes too little 
allowance for individual differences. We shall continue to have 
these masses of mediocre pupils and they will beat upon the portals 
of our high schools and colleges in increasing numbers. One can 
no more stop this than one can check the rising tide. One should 
not want to stop it. Beside the question of this upward surge of 
humanity the bleatings of the pedagogue for his departed stand- 
ards are inconsequential. 

We have mentioned individual differences—the largest idea in 
education. If our testing programs are directed toward revealing 
these differences rather than covering them up, or vainly trying 
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to level them, we shall be making the most fundamental use of our 
measuring instruments. Then, if our philosophy of education and 
our administrative wisdom permit us to give the varying treatment 
which these individual differences suggest, we may keep our stand- 
ards for the stimulation of those who can reach them. But we 
shall have other equally valid standards each differing one from an- 
other not only in amount but in character. We shall have the wis- 
dom to recognize, among other things, that while it may be inad- 
visable to admit everyone to college it may be advisable to offer to 
everyone of college age an educative environment. 

With a little more wisdom, greater tolerance, and a sounder 
philosophy, the dilemma between standards and mass education 
may be solved with the astonishing result that we may keep our 
standards for those to whom they apply, create new standards for 
new situations, and welcome for more complete training the youth 
who aspire to it. 


B. R. B 
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At the last annual meeting of the Board of Trustees, of Teachers College, 
Columbia University, two members of the Research Association were promoted 
to full professorships in that institution. The two men so honored were Wil- 
liam A. McCall and Samuel H. Powers. 


Report of the Board of Education of the District of Columbia is a 94-page 
bulletin of which about one-sixth is devoted to a report of educational re- 
search. Data in regard to individual differences among children, the use of 
intelligence and educational tests, and the organization of surveys within the 
[...] point clearly to progress in this sort of work. 


There has recently come to my desk a bulletin by F. P. O'Brien, director 
of the Educational Research Bureau, University of Kansas, entitled ‘‘An Ex- 
periment in the Supervision of English.’’ The bulletin, dated June, 1926, did 
not reach this office until March, 1927. The report deals with a definite ex- 
periment carried on in two schools near the University of Kansas. The re- 
sults will be of special interest to teachers of English and to supervisors. 


The state-wide survey in arithmetic, made by the Bureau of Educational 
Reference and Research, University of Michigan, during October, 1926, used 
the Woody-McCall Test in Mixed Fundamentals. Dr. Woody, director of the 
Bureau, reports the Michigan achievement markedly superior to Woody-McCall 
standards though the achievement is slightly less than that recorded for the 
preceding year. 


The January number of the Education Bulletin of the State of New Jer- 
sey contains an interesting report of ‘‘An Adventure in Student Self-Govern- 
ment in Plainfield, New Jersey.’’ While the principal and teachers were ab- 
sent attending a teachers’ convention, the student council assumed charge and 
pupil leaders conducted classes for one day. On the whole, the experiment 
seems to have been a decided success. 

The Research Bulletin of the Kalamazoo Public Schools for February 
reports the results of a spelling test in the junior high school, discusses read- 
ing disability, and presents an outline of the extent to which tests are being 
used in the city schools, as well as two communications relative to certain 
phases of the work within the system. The following note, taken from the 
latter, may be of interest: 

All materials relative to a child, such as test papers, progress cards, notes 
from parents, doctors’ certificates, etc., are placed on file in a single folder. 
In this way information concerning any child can be easily and quickly 
obtained. 

Kansas State Teachers College is continuing to sponsor the state scholarship
meets. The fourteenth music contest was held from April 26 to 29; the
high-school art exhibit, from April 25 to 30; and the fifth state scholarship
meet, on April 30. Dr. E. R. Wood, of the Bureau of Educational Measurements
and Standards, was in general charge of the contest.


In reply to a questionnaire sent out by the School of Education, University
of Oklahoma, which sought information regarding the organization of
teacher training in universities and state colleges, not including normal schools
and state teachers’ colleges, seventy-five institutions report as follows:


                                                      Number of
                                                      Institutions
Four-year organizations ............................. 24
[label illegible] ...................................  9
Educational departments
  (usually in College of Arts and Sciences) ......... 21
Graduate schools of education .......................  2
[label illegible] ................................... 19


Of the four-year organizations, 11 are designated as schools of education; 
8, colleges of education; 3, teachers’ colleges; and 1 each, school for vocational
training and department of education, which was explained as corresponding 
to a college in practically all particulars. 


The Summer Session and the Extension Service of the State University of
Washington, in cooperation with the American Library Association and the
Department of Elementary-School Principals of the National Education
Association, will offer a course on the ‘‘Elementary-School Library’’ this summer
immediately after the N. E. A. meeting. The course is organized upon the
intensive institute plan and will consist of one hour of general lecture and two
hours of class instruction daily from July 11 to 22. Mr. J. T. Jennings,
librarian of the Seattle Public Library, and Mr. Joy E. Morgan, of the N. E. A.,
will give the general lectures, while the class instruction will be under the
direction of Miss Lucile Fargo, of the Board of Education for Librarianship
of the American Library Association, and William A. King, principal of one
of the elementary schools of Seattle. It is intended that this type of course
shall be given biennially in the future in order to crystallize the best thought of
the country regarding one of the most important features of the new elementary
school. Inquiries concerning the course should be addressed to Dr. A. C.
Roberts, dean of the Summer Quarter, University of Washington, Seattle.


The Report of the Student Committee of Seventeen at Purdue University
is published by The Division of Educational Reference as ‘‘Studies in Higher
Education, No. VI.’’ At the request of a small group of students this
committee was appointed by the president of the institution and the report covers
the general fields of the purpose of the institution, the instruction, and the
activities.

Each of these students’ reports which have come out of the various
institutions during the last two years shows clearly that the undergraduate body has
some ideas about higher education which are worthy of contemplation by
faculty and administrative officers.

Although interest in student participation fluctuates year by year, the
editor thought it might be of interest to a number of our readers to know
something of the organization and activities of a plan of student participation
which has been used for a number of years by the senior high school of Mitchell,
South Dakota. He therefore asked Superintendent Lindsey to write a brief
communication on this subject. The following is an outlined statement by
Superintendent Lindsey which, the editor feels, is an exceedingly modest
statement of the situation as it exists.


A Scheme of Student Participation
I. History
   1. Originated in February, 1918.
   2. Been in continuous operation since that date.
II. Organization
   1. Student council of nine members
      a) Four senior representatives including president of class and at
         least one girl and one boy.
      b) Three junior representatives including president of class, one
         girl and one boy.
      c) Two sophomore representatives including president of class; one
         must be a girl.
   2. Election to council
      a) Petition signed by ten members of the class constitutes a
         nomination.
      b) Names of those nominated are printed on special ballots and
         elections are held in the homerooms.
   3. Officers chosen at first meeting of council
      a) Chairman
      b) Vice-chairman
      c) Secretary
III. Activities of student council
   1. General disciplinary matters
      a) Cases of discipline which, with consent of principal, council
         wishes to take up, but practically no instances calling for action.
   2. Special duties
      a) Control of study hall during seven periods of day throughout
         year. Council chooses from each study period a student and two
         alternates who have charge during that period. A teacher, who
         remains in charge to some extent throughout the year, has
         entire supervision during the first six weeks of the school year.
         She alone takes roll and issues permits to other rooms or to the
         library.
      b) Council selects student library assistant for each period, during
         two of which student has complete charge.
IV. Miscellaneous suggestions for organization
   1. At the beginning of the school year, have the students discuss the
      system in their home-room groups to determine whether they wish
      to continue the system and to have them give the matter some
      thought out of which possible suggestions for improvement may
      come.
   2. The principal should put responsibility on a student, but know
      what is going on all the time.
   3. It is important that the students in charge do not allow any
      students to leave the study hall.
   4. If the plan is not working well, the principal should meet with
      the council and get them to feel the responsibility for the success
      of the program.
   5. It is well, at the beginning of the operation of the system, to have
      the members of the student council on the platform and for some
      of them to talk to the student body regarding the plan.
   6. A disorderly study period is usually caused by a few students. It
      sometimes is well to take this small number of students out of the
      study hall and put them in a vacant classroom under some teacher,
      preferably the one who is advisor for that study period. The

B. Advantages and Objections to Scheme
I. Benefits to be derived from such a plan
   1. Gives students practice in applying ideas of citizenship in classes.
   2. Improves the relation existing between students and administrative
      officers.
   3. Develops the feeling of group responsibility.
   4. Develops leadership on the part of several students.
   5. Gives the faculty an insight into student delinquencies not usually
      known to them.
II. Objections to be encountered
   1. Students sometimes hesitate to report on each other.
   2. Students show favoritism, unless this is carefully guarded against.
   3. The right type of student leadership is not always to be found in
      each study hall.
   4. The student in charge, unless carefully checked, will assume too
      much freedom for himself. Students in charge have been known
      to put substitutes in their places and leave the building.
III. Conclusions
   1. Whether the lack of any disciplinary problems outside of the study
      hall is due to the system in vogue is possibly hard to prove, but it
      appears to have much to do with this wholesome condition.
   2. While the condition in the study hall at times is open to criticism,
      on the whole it is as good as can be obtained with teachers in
      charge, and in a majority of the periods is as good as any teacher
      could obtain.
   3. The advantage gained in developing a wholesome spirit between
      students and faculty is an additional argument in favor of
      retaining the system.

THORNDIKE, E. L., BREGMAN, E. O., COBB, M. V., AND OTHERS. The Measurement
of Intelligence. New York, Teachers College, Columbia University,
1926. 616 pp.

The Measurement of Intelligence by Thorndike, Bregman, Cobb, Woodyard,
and others reports the results of many and extensive investigations of
mental phenomena. After defining ‘‘intellect CAVD’’ (composed of completion,
arithmetic, vocabulary, and directions tasks) they investigate the units
in which it is to be measured, the distribution of it in individuals of the
same age, and of the same grade, and the distribution of evidences of it in
individuals at different times. They measure an individual’s ‘‘altitude’’ in
this trait and suggest a means of measuring his ‘‘width’’ and ‘‘area’’ of
CAVD or other intellect. They report upon growth in altitude and infer
growth in width. Though they report a tentative determination of a zero
point in the ‘‘truly equal units’’ in which they measure altitude they also
suggest that it may be greatly in error. They do not supply a syllabus of
terms used and if one seeks to ascertain the meaning of ‘‘truly equal’’ as
well as of many other expressions without reading through the book in sequence
he will probably find it something of a task, but the book is not meant for the
casual reader or to be used as an elementary text. It is written as a guide
and handbook for the serious devisor of mental measurements and for the
investigator into the basic relationships of mental phenomena. There is much
that is new in fact and much that is original in outlook, and it is hard to
judge what is its most valuable contribution. Thorndike’s individual ability
to utilize a large amount of data for a single purpose and to keep many facts
in mind with reference to a single end is manifest throughout. This does
not make for simple reading, but it does provide a well-buttressed foundation
upon which to build. The pure mathematician would probably be annoyed by
his method, for he would ask for a few axioms to be accepted in toto and
build thereon. If this latter approach is thought of as tying a structure to
a rock, Thorndike’s may be thought of as a building with so broad a foundation
that if one or another of the many underpinnings fails the structure still
stands. For example, should the zero point mentioned be proved five or ten
times too low (from the mental age of six) it would affect his width argument
a trifle, his truly equal units not at all, his determination of form of
distribution not at all, etc. Thorndike’s approach is the only possible
approach with our present state of knowledge, for it would surely be foolhardy
to pin one’s faith upon any single axiom of mental life or structure that has
been thus far proposed.

The authors establish by a wealth of data the normal, or near-normal,
distribution as descriptive of the distribution of intellect for certain age
groups and certain grade groups. One common element runs through all of
this evidence though its presence is at times quite obscure. We find on page 483
the statement:

The form of distribution of the varying abilities of the individuals in a
group . . . . may be ascertained with a high degree of probability by
submitting the individual or group to many graded series of tasks, each series
being made with the intent to have tasks spaced at equal intervals of
difficulty. . . . . The score to be given is a level score.

The reviewer believes the crux of the situation is to be found in the words 
which he has italicized. If this has been the usual intent then test units will 
tend to reflect units of sense difference. Having demonstrated that certain
distributions are normal in the sense just indicated, the authors adopt the 
normal distribution and scale, or transform, the units of their own and many 
other tests into ‘‘truly equal units.’’ The results are highly valuable because 
in addition to being comparable from test to test they are intrinsically highly 
reasonable. 
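
The kind of transformation described above can be illustrated with a minimal sketch. It is not the authors' own procedure; it only assumes, as the review states, that ability in a group is normally distributed, and it converts the percentage of the group passing a task into a position on the normal scale. The function name and the sample percentages are hypothetical.

```python
from statistics import NormalDist


def difficulty_in_sigma_units(percent_passing):
    """Place a task on a normal (sigma) scale from the percent of a group passing it.

    Assumption for illustration: ability in the group is normally distributed,
    so a task passed by 50 percent sits at 0 and harder tasks get larger values.
    percent_passing must lie strictly between 0 and 100.
    """
    proportion_failing = 1.0 - percent_passing / 100.0
    return NormalDist().inv_cdf(proportion_failing)


# Hypothetical tasks: passed by 84, 50, and 16 percent of the same group.
for pct in (84, 50, 16):
    print(pct, round(difficulty_in_sigma_units(pct), 2))
```

On such a scale, equal steps represent equal distances along the assumed normal curve rather than equal differences in percent passing, which is the sense in which units from different tests become comparable.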

The distinction between level or altitude and width of intellect seems now
promising, now the reverse. If, as several times suggested, there is high,
nearly perfect, correlation between the two, distinction can hardly be of much
psychological importance, whereas, if, as indicated in chapter xvi, the curve
of growth in the one is very different from that in the other then the
distinction is important. The reviewer does not see how both of these things can
be true at the same time. The importance of the altitude concept is universally
recognized. The value of the width concept can only be tested when adequate
measures of it, if such are possible, are made available.

The authors clearly establish a radical difference between speed and
altitude. They are, by contrast with most intelligence testers, delightfully
modest as to claims of completeness with which their test measures intelligence.
They have in fact written a tome of six hundred pages without at any point
implying that their best measures do anything more than measure a specific
function represented by their test exercises. By consistently referring to
intellect CAVD, GOPI, CAVDOSR, etc., no matter how awkward the style, they
have adopted the only procedure which claims neither too much nor too little.
They establish, and herein they are in a firmer position than any earlier test
devisors, the essential uniformity of their measure CAVD from low to high
levels. They have seen the possibility of doing this and pointed out its
necessity along with many other things fully as important and as uncommon. The
book should be thoroughly studied by every devisor of mental tests and by
every interpreter who is concerned with the foundations upon which mental
measurement rests.

TRUMAN L. KELLEY 
Stanford University 


COOK, WILLIAM A. Federal and State School Administration. New York,
Thomas Y. Crowell Company, 1927. xvi + 373 pp. 


Dr. Cook makes no attempt to outline in detail the organization upon
which federal and state school administration is or should be based. His
object has been to present the historical background and the logical bases for
federal and state control in education. This he has done in an admirable
manner. His style is racy, pointed, and decidedly interesting. Facts,
historical and current, fill the volume from cover to cover and yet it is entirely
readable.

The first third of the book tells a story of federal participation in and
aid to education which far surpasses our common conception. The pros and
cons with reference to recently attempted federal legislation concerning a
federal department of education are well restated. Within the compass of
twenty-five pages the author deals with state educational machinery and most
of the remaining pages of this virile volume discuss somewhat in detail
definite functions which have been, or are being, assumed by state departments
of education. The need for and the difficulties involved in such control are
clearly stated.

Dr. Cook has been successful in presenting in a pointed way aspects of
federal and state administration about which every classroom teacher should
be informed. Certainly, if administrative progress is to be made in our public
schools, there should be in every community those who are familiar with
present tendencies, who know of past practices and who will be able to
satisfy the mind of the laymen with respect to the advisability or inadvisability
of certain proposed reforms. As Dr. Cook suggests, who is in a position to
perform such a service better than the teacher?

A. O. HECK
Ohio State University


BUCKINGHAM, B. R., AND OSBURN, W. J. The Buckingham-Osburn Searchlight
Arithmetics: Introductory Book. Boston, Ginn and Company, 1927.
xv + 381 pp.

The introductory book of the Buckingham-Osburn Searchlight Arithmetics
is written especially for the teachers of the first and second grades. Besides a
full and complete discussion of the methods chosen by the authors, it contains
sufficient practice material for the children. In fact, it contains the material
for a complete course. While written with no particular curriculum in mind,
it will be easily adjusted to any of the reasonably satisfactory courses of study.


Since both authors are directors of research bureaus, the reader will expect 
them to make use of experimental data, and he will not be disappointed. 
Knowing the habits and skills which the child must master in order to cover
accurately and rapidly the work of later grades, the authors have sought
through method and material to prevent error in the development of these 
necessary habits and skills. The teacher who uses this book will know not only 
what to do but the reasons for so doing. 

The order of presenting the combinations has been carefully worked out. 
Three fundamental principles guide this procedure: the prevention of
counting, by which no two combinations involving the same number are presented
together; the reducing of interference, by which no two combinations
containing the same addend are taught together; and the recognition of the
difficulty of the various combinations. 

Based upon these principles which are the products of research, the 
authors have divided the 100 combinations into an easy and a difficult group. 
Each group is further subdivided into teaching units. The first two
teaching units are 2 + 2 = 4, 4 - 2 = 2, and 1 + 4 = 5, 4 + 1 = 5, 5 - 1 = 4,
5 - 4 = 1. The complete form of each combination is to be given. This
prevents the habit of counting and tends to set up an ideal of accuracy. For each
teaching unit the authors have worked out a difficulty index which determines, 
to a large extent, the order of the presentation. 
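
As a rough illustration of how a teaching unit groups the related facts for one pair of addends, the following sketch reproduces the two units quoted above. The function name is hypothetical, and the sketch does not attempt the authors' difficulty index, which the review does not describe.

```python
def teaching_unit(a, b):
    """List the addition and subtraction facts that belong to the addends a and b.

    Each fact is written in its complete form. When the two addends are
    equal, the reversed facts would be duplicates and are left out.
    """
    total = a + b
    facts = [f"{a} + {b} = {total}"]
    if a != b:
        facts.append(f"{b} + {a} = {total}")
    facts.append(f"{total} - {a} = {b}")
    if a != b:
        facts.append(f"{total} - {b} = {a}")
    return facts


# The first two teaching units named in the review.
print(teaching_unit(2, 2))
print(teaching_unit(1, 4))
```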

In subtraction the authors adopt the take-away method. They point out 
that there are problems and situations in life which favor the additive method 
just as well as problems and situations which favor the take-away procedure.
The scientific evidence seems to be as favorable to one as to the other. 

The additive method is not favored for several reasons, among
which are: (1) In long subtraction the method leads to a procedure which 
is not straightforward. (2) The additive method is not generally the method 
of making change. (3) The additive method provides no adequate means of 
checking. 

Recent investigation has shown the difficulty on the part of many pupils 
with the zero combinations; the authors devote one chapter to the teaching 
of these combinations. The four chapters devoted to games and devices will 
be especially valuable to the beginning teacher. 

The entire book is not only based on research but has been subjected to 
classroom use. It was the reviewer’s privilege to follow it through in sev- 
eral classes, and, in his opinion, those who are supervising or teaching primary 
arithmetic cannot afford to do without it. The book must be read to be fully 
appreciated, for it is impossible to do justice to it in a short review. 

B. W. DEBUSK
Director, Bureau of Research
Portland, Oregon